We develop a theory of ergodicity for a class of random dynamical
systems where the driving noise is not white. The two main tools of 
our analysis are the strong Feller property and topological irreducibil- 
ity, introduced in this work for a class of non-Markovian systems. 
They allow us to obtain a criterion for ergodicity which is similar in
nature to the Doob-Khas'minskii theorem.

The second part of this article shows how it is possible to apply 
these results to the case of stochastic differential equations driven 
by fractional Brownian motion. It follows that under a nondegener- 
acy condition on the noise, such equations admit a unique adapted 
stationary solution. 

1. Introduction. Ergodic properties of Markovian systems have been in- 
tensively studied, especially in the context of stochastic differential equa- 
tions (henceforth abbreviated as SDEs). Many authors have been studying 
the problem of ergodicity for Markovian systems induced by finite- and 
infinite-dimensional stochastic equations driven by a Brownian motion. A 
good summary of the current state of research in this area can be found in 
the monographs [4, 8, 13]. The asymptotic behavior of processes driven by 
a noise with nontrivial time correlations seems to be much less well under- 
stood, although substantial progress has been made in the framework of the 
theory of random dynamical systems [2]. However, this framework takes a rather
"deterministic" approach and is mainly suitable for the study of the random 
equivalent of the objects from the theory of ordinary dynamical systems. A 
natural question is whether one can take a more "probabilistic" approach 
and obtain statements that are similar in spirit and in scope to the ones 
obtained in the Markovian case. This is the program that we start to carry 
out in this work. Our main goal is to obtain criteria for the existence and
uniqueness of an "invariant measure" (in a sense to be made precise) that
are comparable in scope to the existing criteria for Markov processes.

More precisely, we are interested in providing a generalization of the 
widely used result attributed to Doob and Khas'minskii which states that 
a Markov process which is strong Feller and topologically irreducible can 
have at most one invariant measure (see, e.g., [4], Proposition 4.1.1 and 
Theorem 4.2.1). The obvious question that arises is how to formulate a use- 
ful generalization of the strong Feller property in non-Markovian situations. 
This question will be answered, to a certain extent, in the framework of 
"stochastic dynamical systems" (SDS) developed in [7]. Roughly speaking, 
an SDS is simply a random dynamical system which is reformulated in such 
a way that one sees how new randomness comes into the system as time 
evolves. One characteristic of this point-of-view is that it automatically dis- 
cards invariant measures that are not measurable with respect to the past; 
see [2] for this terminology. Note that this is actually a desirable feature if 
one wishes to obtain a natural generalization of the concept of "invariant 
measure" from the theory of Markov processes. For example, in the case 
of a diffusion on the circle with a nontrivial drift, the theory of Markov 
processes yields the existence of a unique invariant measure. The theory of 
random dynamical systems, on the other hand, yields two distinct invariant 
measures, but one of them is measurable with respect to the future and 
corresponds to an unstable random fixed point. Even though such invariant 
measures correspond to stationary solutions of the corresponding SDE, they 
are "unphysical" in the sense that they can only be realized by preparing the 
initial condition in a state that depends on the whole future of the driving 
noise. The main result of this first, "abstract," part of the present article is 
Theorem 3.10 below. 

As a test of the relevance of our criterion, we then show that it can be
applied to the case of SDEs driven by fractional Brownian motion (fBm). 
The choice of fractional noise as driving noise (rather than, e.g., an Ornstein- 
Uhlenbeck process) is motivated by the following arguments: 

1. one cannot reduce it to a Markovian situation without adding infinitely 
many degrees of freedom; 

2. it presents long range correlations and therefore does not reduce to white 
noise in the limit of large time rescalings; 

3. it is very well studied, so that many a priori estimates are available in 
the existing literature; 

4. it appears naturally as the only continuous scale-invariant Gaussian pro- 
cess. 

This article should be considered as a sequel to the work [7], where SDEs 
driven by additive fractional noise were considered. In this situation, a cou- 
pling argument allowed it to be shown that such SDEs possess a unique 
invariant measure under quite general conditions. Unfortunately, this argu-
ment presented two major drawbacks. First, it was very difficult to follow
and hard to analyze because of the long-range correlations of the driving 
noise. Second, the coupling construction used the additivity of the noise in 
an essential way, making the argument unsuitable to the study of equations 
driven by multiplicative noise. 

In this work, we consider equations driven by nondegenerate multiplica- 
tive noise, that is, we study the SDE 

(1.1) $dx_t = f(x_t)\,dt + \sigma(x_t)\,dB_H(t), \qquad x(0) = x_0 \in \mathbf{R}^d,$

where $f : \mathbf{R}^d \to \mathbf{R}^d$, $\sigma : \mathbf{R}^d \to M^{d \times d}$ (where $M^{d \times d}$ denotes the space of $d \times d$
matrices) and $B_H$ is a $d$-dimensional fractional Brownian motion with Hurst
parameter $H$. In other words, it is a centered $d$-dimensional Gaussian process
with continuous sample paths, $B_H(0) = 0$ and covariance

$\mathbf{E}\bigl(B_H^i(t) - B_H^i(s)\bigr)\bigl(B_H^j(t) - B_H^j(s)\bigr) = \delta_{ij}\,|t - s|^{2H}$

for $t, s \in \mathbf{R}$ and $i, j = 1, \ldots, d$. We will assume throughout this work that $H$
is strictly greater than 1/2 so that the integral appearing in the right-hand 
side of (1.1) may be considered pathwise as a Riemann-Stieltjes integral. We 
believe that this restriction could be weakened by considering noise spaces
of "rough path" type (see, e.g., [6, 12]), but this would raise additional dif-
ficulties that we do not wish to address here. A pair $(x, B_H)$ of continuous
stochastic processes is called a solution to (1.1) if $B_H$ is an fBm and the inte-
grated form of (1.1) holds almost surely for all times. We call such a solution
adapted if for every $t$, $x(t)$ and $\{B_H(s)\}_{s \ge t}$ are conditionally independent,
given $\{B_H(s)\}_{s \le t}$.
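
The covariance structure above can be explored numerically. The following is a minimal sketch for one scalar component, assuming only the stated properties $B_H(0) = 0$ and $\mathbf{E}|B_H(t) - B_H(s)|^2 = |t - s|^{2H}$; the helper names `fbm_cov` and `increment_cov` are ours and purely illustrative. It recovers the covariance by polarization and shows that adjacent increments are uncorrelated for $H = 1/2$ but positively correlated for $H > 1/2$.

```python
def fbm_cov(s, t, hurst):
    """Cov(B_H(s), B_H(t)), obtained by polarization from B_H(0) = 0 and
    E|B_H(t) - B_H(s)|^2 = |t - s|^(2H)."""
    return 0.5 * (abs(s) ** (2 * hurst) + abs(t) ** (2 * hurst)
                  - abs(t - s) ** (2 * hurst))


def increment_cov(a, b, c, d, hurst):
    """Cov(B_H(b) - B_H(a), B_H(d) - B_H(c)), by bilinearity."""
    return (fbm_cov(b, d, hurst) - fbm_cov(b, c, hurst)
            - fbm_cov(a, d, hurst) + fbm_cov(a, c, hurst))


for hurst in (0.5, 0.75):
    # Variance of the increment over [0.3, 1.2]; should equal 0.9^(2H).
    variance = increment_cov(0.3, 1.2, 0.3, 1.2, hurst)
    # Covariance of the adjacent increments over [0, 1] and [1, 2].
    adjacent = increment_cov(0.0, 1.0, 1.0, 2.0, hurst)
    print(hurst, variance, adjacent)
```

The positive covariance of adjacent increments for $H > 1/2$ is one concrete manifestation of the long-range correlations mentioned in the list of motivations.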

In order to ensure the global existence of solutions and in order to have 
some control over it, we make, for most of this paper, the following assump-
tions on the coefficients $f$ and $\sigma$.

(H1) Regularity: Both $f$ and $\sigma$ are $C^\infty$. Furthermore, the diffusion coef-
ficient $\sigma$ and the derivatives of $f$ and $\sigma$ are globally bounded:

(1.2) $\sup_{x \in \mathbf{R}^d} \bigl(|\sigma(x)| + |Df(x)| + |D\sigma(x)|\bigr) < \infty.$

(H2) Nondegeneracy: $\sigma(x) \in M^{d \times d}$ is invertible and $\sup_{x \in \mathbf{R}^d} |\sigma^{-1}(x)| < \infty$.

(H3) Dissipativity: There exists $C > 0$ such that

$\langle f(x), x \rangle \le C(1 - \|x\|^2) \qquad \forall x \in \mathbf{R}^d.$

The main result of this paper is the following. 

Theorem 1.1. If the coefficients of the SDE (1.1) satisfy assumptions 
(H1)-(H3), then it has exactly one adapted stationary solution. 






M. HAIRER AND A. OHASHI 



The remainder of this paper is organized in the following way. After fixing 
the notation and recalling some results from [7] in Section 2, we formulate 
and state the main abstract result in Section 3. Section 4 is devoted to 
ensuring that the abstract framework constructed in Section 2 can be applied 
to the SDE (1.1). It also provides the a priori bounds required to ensure the 
existence of an invariant measure for such systems. We then spend most 
of Section 5 proving that the generalization of the strong Feller property 
formulated in Section 3 does indeed hold for (1.1). This allows us to obtain
Theorem 1.1 as a simple corollary.

2. Preliminaries. In this section, we fix the basic notation that we use 
in this paper and recall some basic definitions and results from [7]. Given a
product space $\mathcal{X} \times \mathcal{Y}$, we denote by $\Pi_{\mathcal{X}}$ and $\Pi_{\mathcal{Y}}$ the projections on $\mathcal{X}$ and
$\mathcal{Y}$, respectively. Also, given two measurable spaces $\mathcal{E}$ and $\mathcal{F}$, a measurable
map $f : \mathcal{E} \to \mathcal{F}$ and a measure $\mu$ on $\mathcal{E}$, we define the measure $f^*\mu$ on $\mathcal{F}$ in
the natural way by $f^*\mu = \mu \circ f^{-1}$. We denote by $\delta_x$ the usual delta measure
located at $x \in \mathcal{X}$. We also denote by $\mathcal{M}_1(\mathcal{X})$ and $\mathcal{M}_+(\mathcal{X})$ the set of probabil-
ity measures and positive finite measures on $\mathcal{X}$, respectively. We endow both
sets with the topology of weak convergence. If $\mathcal{X}$ is a Polish space, then we
denote by $C([0,T], \mathcal{X})$ the space of continuous functions $f : [0,T] \to \mathcal{X}$. We
endow this space with the usual topology of uniform convergence.

We first define the structure of the class of noise processes that we are 
going to work with. 

Definition 2.1. A quadruple $(\mathcal{W}, \{\mathcal{P}_t\}_{t \ge 0}, \mathbf{P}_\omega, \{\theta_t\}_{t \ge 0})$ is called a sta-
tionary noise process if it satisfies the following:

(i) $\mathcal{W}$ is a Polish space;

(ii) $\mathcal{P}_t$ is a Feller transition semigroup on $\mathcal{W}$ which accepts $\mathbf{P}_\omega$ as its
unique invariant measure;

(iii) the family $\{\theta_t\}_{t \ge 0}$ is a semiflow of measurable maps on $\mathcal{W}$ satisfying
the property $\theta_t^* \mathcal{P}_t(x, \cdot) = \delta_x$ for every $x \in \mathcal{W}$ and every $t > 0$.

The following definition is taken from [7] and provides the general frame- 
work in which we are going to address the question of ergodicity. 

Definition 2.2. A continuous stochastic dynamical system (SDS) on
the Polish space $\mathcal{X}$ over the stationary noise process $(\mathcal{W}, \{\mathcal{P}_t\}_{t \ge 0}, \mathbf{P}_\omega, \{\theta_t\}_{t \ge 0})$
is a map

$\Lambda : \mathbf{R}_+ \times \mathcal{X} \times \mathcal{W} \to \mathcal{X}, \qquad (t, x, w) \mapsto \Lambda_t(x, w),$

with the following properties.

(i) Regularity of paths: For every $T > 0$, $x \in \mathcal{X}$ and $w \in \mathcal{W}$, the map
$\Phi_T(x, w) : [0,T] \to \mathcal{X}$ defined by

$\Phi_T(x, w)(t) = \Lambda_t(x, \theta_{T-t} w)$

belongs to $C([0,T], \mathcal{X})$.

(ii) Continuous dependence: The map $(x, w) \mapsto \Phi_T(x, w)$ is continuous
from $\mathcal{X} \times \mathcal{W}$ to $C([0,T], \mathcal{X})$ for every $T > 0$.

(iii) Cocycle property: The family of maps $\Lambda_t$ satisfies

$\Lambda_0(x, w) = x,$

(2.1)

$\Lambda_{s+t}(x, w) = \Lambda_s(\Lambda_t(x, \theta_s w), w),$

for all $s, t > 0$, all $x \in \mathcal{X}$ and all $w \in \mathcal{W}$.
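
To make the cocycle property (2.1) concrete, here is a minimal discrete-time sketch. It is not the construction used in this article: the noise $w$ is modelled as a finite tuple of past noise values (most recent last), the shift forgets the most recent entries, and the update rule `step` is a hypothetical choice made only for illustration.

```python
def theta(s, w):
    """Discrete analogue of the shift: forget the s most recent noise values."""
    return w[:len(w) - s]


def make_sds(step):
    """Return Lambda(t, x, w): start from x and apply the last t entries of the
    noise history w in chronological order, using the update rule `step`."""
    def Lambda(t, x, w):
        for xi in w[len(w) - t:]:
            x = step(x, xi)
        return x
    return Lambda


# Hypothetical update rule, chosen only for illustration.
Lambda = make_sds(lambda x, xi: 0.5 * x + xi)

w = (0.3, -1.0, 0.7, 0.2, -0.4, 1.1)   # w[-1] is the most recent value
x, s, t = 2.0, 2, 3

# Cocycle property: running s + t steps at once agrees with running t steps on
# the shifted noise and then s more steps on the full noise.
lhs = Lambda(s + t, x, w)
rhs = Lambda(s, Lambda(t, x, theta(s, w)), w)
print(lhs == rhs, Lambda(0, x, w) == x)
```

In this toy model the noise argument always encodes the whole past up to the final time, which is why the inner evolution sees the shifted noise while the outer one sees the unshifted noise.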

Given an SDS as in Definition 2.2 and an initial condition $x_0 \in \mathcal{X}$, we now
show how to use it to construct in a natural way an $\mathcal{X}$-valued stochastic
process with initial condition $x_0$. First, given $t > 0$ and $(x, w) \in \mathcal{X} \times \mathcal{W}$, we
construct a probability measure $Q_t(x, w; \cdot)$ on $\mathcal{X} \times \mathcal{W}$ by

(2.2) $Q_t(x, w; A \times B) = \int_B \delta_{\Lambda_t(x, w')}(A)\, \mathcal{P}_t(w, dw'),$

where $\delta_x$ denotes the delta measure located at $x$, $A$ is a measurable subset
of $\mathcal{X}$ and $B$ is a measurable subset of $\mathcal{W}$. One can show [7], Lemma 2.12,
that the family of measures $Q_t(x, w; \cdot)$ actually forms a Feller transition
semigroup on $\mathcal{X} \times \mathcal{W}$ and if a probability measure $\mu$ on $\mathcal{X} \times \mathcal{W}$ satisfies
$\Pi_{\mathcal{W}}^* \mu = \mathbf{P}_\omega$, then $\Pi_{\mathcal{W}}^* Q_t \mu = \mathbf{P}_\omega$ for all $t > 0$. This suggests the following
definition.

Definition 2.3. Let $\Lambda$ be an SDS as above. Then a probability measure
$\mu$ on $\mathcal{X} \times \mathcal{W}$ is called a generalized initial condition for $\Lambda$ if $\Pi_{\mathcal{W}}^* \mu = \mathbf{P}_\omega$. We
denote by $\mathcal{M}_\Lambda$ the space of generalized initial conditions endowed with the
topology of weak convergence. Elements of $\mathcal{M}_\Lambda$ of the form $\mu = \delta_x \times \mathbf{P}_\omega$ for
some $x \in \mathcal{X}$ will be called initial conditions.

Given a generalized initial condition $\mu$, we construct a stochastic process
$(x_t, w_t)$ on $\mathcal{X} \times \mathcal{W}$ by drawing its initial condition according to $\mu$ and then
evolving it according to the transition semigroup $Q_t$. The marginal $x_t$ of
this process on $\mathcal{X}$ will be called the process generated by $\Lambda$ for $\mu$. We will
denote by $Q\mu$ the law of this process [i.e., $Q\mu$ is a measure on $C(\mathbf{R}_+, \mathcal{X})$].

Definition 2.4. Two generalized initial conditions $\mu$ and $\nu$ of an SDS
$\Lambda$ are equivalent if the processes generated by $\mu$ and $\nu$ are equal in law. In
short, $\mu \sim \nu \Leftrightarrow Q\mu = Q\nu$.

We say that a generalized initial condition $\mu$ is invariant for the SDS $\Lambda$
if it is invariant for the Markov transition semigroup $Q_t$ generated by $\Lambda$.
Similarly, we call it ergodic if it is ergodic for $Q_t$, that is, if the law of the
stationary Markov process on $\mathcal{X} \times \mathcal{W}$ with transition probabilities $Q_t$ and
fixed-time marginal $\mu$ is ergodic for the shift map.

The following remark turns out to be very useful for the approach taken 
in this work. 

Lemma 2.5. The map $Q$ preserves ergodicity in the sense that if $\mu \in
\mathcal{M}_\Lambda$ is an ergodic invariant measure for the SDS $\Lambda$, then $Q\mu$ is an ergodic
invariant measure for the shift map on $C(\mathbf{R}_+, \mathcal{X})$.

Proof. This is an immediate consequence of the general fact that if $T$
and $\hat{T}$ are two measurable transformations on measure spaces $\mathcal{E}$ and $\hat{\mathcal{E}}$ and
there exists a measurable map $f : \mathcal{E} \to \hat{\mathcal{E}}$ such that $f \circ T = \hat{T} \circ f$, then if a
measure $\mu$ is ergodic for $T$, $f^*\mu$ is ergodic for $\hat{T}$. □

Remark 2.6. As a consequence of the above result, if $\mu, \nu \in \mathcal{M}_\Lambda$ are
two ergodic invariant measures for the semigroup $Q_t$, then either $Q\mu = Q\nu$
or $Q\mu$ and $Q\nu$ are mutually singular.

3. An abstract ergodicity result. The main motivation of this section 
is provided by the following well-known facts from the theory of Markov 
processes. Recall that a Markov process on a Polish space $\mathcal{X}$ with transition
probabilities $\mathcal{P}_t$ is called topologically irreducible at time $t$ if $\mathcal{P}_t(x, A) > 0$
for every $x \in \mathcal{X}$ and every open set $A \subset \mathcal{X}$. We call it simply topologically
irreducible if there exists such a time.

It is called strong Feller at time $t$ if $\mathcal{P}_t \varphi$ is continuous for every bounded
measurable function $\varphi : \mathcal{X} \to \mathbf{R}$. Here, we abused notation and again used the
symbol $\mathcal{P}_t$ to denote the corresponding semigroup acting on observables. It
is immediate that the strong Feller property is equivalent to the continuity
of the function $x \mapsto \mathcal{P}_t(x, \cdot)$ if the space of probability measures on $\mathcal{X}$ is
equipped with the topology of strong convergence. A standard result often
attributed to Doob and Khas'minskii states the following.

Theorem 3.1 (Doob-Khas'minskii). If a Markov process on a Polish
space $\mathcal{X}$ with transition probabilities $\mathcal{P}_t$ is topologically irreducible and strong
Feller, then it can have at most one invariant probability measure.

In this section we introduce the strong Feller property and irreducibility 
in the abstract framework of SDS as laid out in the previous section. As 
already pointed out in [9], the strong Feller property as stated above is 
actually not easily amenable to generalization, mainly because the topology 
of strong convergence of measures is not metrizable. Instead of generalizing
the notion of continuity of the transition probabilities in the topology of 
strong convergence, we will thus follow the approach laid out in [9] and 
provide a generalization of the notion of continuity in the total variation 
topology. 

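In this section, we consider, as before, a general SDS $\Lambda$ on a Polish space
$\mathcal{X}$ with stationary noise process $(\mathcal{W}, \{\mathcal{P}_t\}_{t \ge 0}, \mathbf{P}_\omega, \{\theta_t\}_{t \ge 0})$. Remember that
we introduced a linear map $Q$ from $\mathcal{M}_1(\mathcal{X} \times \mathcal{W})$ into $\mathcal{M}_1(C(\mathbf{R}_+, \mathcal{X}))$ con-
structed as the law of the process on $\mathcal{X}$ with a given initial condition. Denot-
ing by $R_t : C(\mathbf{R}_+, \mathcal{X}) \to C([t, \infty), \mathcal{X})$ the natural restriction map, we define
the sets

$\mathcal{N}_{\mathcal{W}}^t = \{(w, \tilde{w}) \in \mathcal{W}^2 \mid R_t^* Q\delta_{(x,w)} \sim R_t^* Q\delta_{(x,\tilde{w})} \ \forall x \in \mathcal{X}\},$
$\mathcal{N}_{\mathcal{X}}^t = \{(x, y, w) \in \mathcal{X}^2 \times \mathcal{W} \mid R_t^* Q\delta_{(x,w)} \not\perp R_t^* Q\delta_{(y,w)}\},$

$\mathcal{N}^t = \{(x, y, w, \tilde{w}) \in \mathcal{X}^2 \times \mathcal{W}^2 \mid (w, \tilde{w}) \in \mathcal{N}_{\mathcal{W}}^t \text{ and } (x, y, w) \in \mathcal{N}_{\mathcal{X}}^t\}.$

Here and in the sequel, we write $\mu \sim \nu$ to denote that two measures $\mu$ and
$\nu$ are mutually absolutely continuous and $\mu \perp \nu$ to denote that they are
mutually singular. We will also use the notation $\mu \le \nu$ as a shorthand for
"$\mu(A) \le \nu(A)$ for every measurable set $A$."

Note that, beside the symmetries obvious from the definitions, the set $\mathcal{N}^t$
has the property

$(x, y, w, \tilde{w}) \in \mathcal{N}^t \ \Rightarrow\ (x, y, \tilde{w}, w) \in \mathcal{N}^t.$

Note, also, that in the Markovian case, $Q\delta_{(x,w)}$ is independent of $w$, so
$\mathcal{N}_{\mathcal{W}}^t = \mathcal{W}^2$ and $\mathcal{N}^t$ can be considered as a subset of $\mathcal{X}^2$ for every $t > 0$.

Recall that a coupling between two measures $\mu$ and $\nu$ on a space $\mathcal{X}$ is a
measure $\pi$ on $\mathcal{X}^2$ such that $\pi(A \times \mathcal{X}) = \mu(A)$ and $\pi(\mathcal{X} \times A) = \nu(A)$ for every
measurable set $A \subset \mathcal{X}$. In the same spirit, we will say that $\pi$ is a subcoupling
for $\mu$ and $\nu$ if $\pi(A \times \mathcal{X}) \le \mu(A)$ and $\pi(\mathcal{X} \times A) \le \nu(A)$.

Consider the map

$\hat{\Lambda}_t : \mathcal{X}^2 \times \mathcal{W}^2 \to \mathcal{X}^2 \times \mathcal{W}^2,$

defined as $\hat{\Lambda}_t(x, y, \omega, \tilde{\omega}) = (\Lambda_t(x, \omega), \Lambda_t(y, \tilde{\omega}), \omega, \tilde{\omega})$ for $(x, y) \in \mathcal{X}^2$ and $(\omega, \tilde{\omega}) \in
\mathcal{W}^2$. We will abuse notation by also writing $\hat{\Lambda}_t(x, y)$ for the map from $\mathcal{W}^2$ to
$\mathcal{X}^2 \times \mathcal{W}^2$ obtained by fixing the first two arguments.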

With this notation in place, the abstract result laying the foundation for 
the present work is the following. 

Theorem 3.2. Let $\Lambda$ be as above and assume that there exists a time
$t > 0$ and a jointly measurable map

$(x, y, w) \mapsto \mathcal{P}_t^{x,y}(w, \cdot) \in \mathcal{M}_+(\mathcal{W}^2)$

with the following properties:

1. the measure $\mathcal{P}_t^{x,y}(w, \cdot)$ is a subcoupling for $\mathcal{P}_t(w, \cdot)$ and $\mathcal{P}_t(w, \cdot)$ for every
$(x, y, w) \in \mathcal{X}^2 \times \mathcal{W}$;

2. there exists $s > 0$ such that

(3.1) $\bigl(\hat{\Lambda}_t(x, y)^* \mathcal{P}_t^{x,y}(w, \cdot)\bigr)(\mathcal{N}^s) > 0$

for every $(x, y, w) \in \mathcal{X}^2 \times \mathcal{W}$.

Then $\Lambda$ can have at most one invariant measure (up to the equivalence
relation of Definition 2.4).

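Proof. Assume, by contradiction, that $\mu$ and $\nu$ are two distinct ergodic
invariant measures for the SDS $\Lambda$ such that $Q\mu \ne Q\nu$. We claim that there
exist nonzero positive measures $\hat{\mu}$, $\hat{\nu}$ and $\bar{\nu}$ on $\mathcal{W} \times \mathcal{X}$ such that

$R_s^* Q\mu \ \ge\ R_s^* Q\hat{\mu} \ \not\perp\ R_s^* Q\bar{\nu} \ \sim\ R_s^* Q\hat{\nu} \ \le\ R_s^* Q\nu.$

If we are able to construct such measures, it follows immediately that $R_s^* Q\mu$
and $R_s^* Q\nu$ are not mutually singular, thus leading to a contradiction, by
Lemma 2.5. Let us consider the finite measure $\hat{\Lambda}_t(x, y)^* \mathcal{P}_t^{x,y}(w, \cdot)$ on $\mathcal{W}_1 \times
\mathcal{W}_2 \times \mathcal{X}_1 \times \mathcal{X}_2$, where $(\mathcal{W}_1, \mathcal{W}_2)$ and $(\mathcal{X}_1, \mathcal{X}_2)$ denote two copies of $\mathcal{W}$ and $\mathcal{X}$,
respectively. By assumption, there exist times $s > 0$ and $t > 0$ such that

$\bigl(\hat{\Lambda}_t(x, y)^* \mathcal{P}_t^{x,y}(w, \cdot)\bigr)(\mathcal{N}^s) > 0$

for every $(x, y, w) \in \mathcal{X}^2 \times \mathcal{W}$. This shows that the measure $\theta^{(\mu,\nu)}$ on $\mathcal{N}^s$
defined by

$\theta^{(\mu,\nu)}(A) := \int_{\mathcal{W}} \int_{\mathcal{X}^2} \bigl(\hat{\Lambda}_t(x, y)^* \mathcal{P}_t^{x,y}(\omega, \cdot)\bigr)(A \cap \mathcal{N}^s)\, \mu(\omega, dx)\, \nu(\omega, dy)\, \mathbf{P}_\omega(d\omega),$

is not identically 0. Here, $\mu(\omega, \cdot)$ and $\nu(\omega, \cdot)$ are the disintegrations of $\mu$ and $\nu$
with respect to $\mathbf{P}_\omega$. By using the hypothesis that $\mathcal{P}_t^{x,y}(w, \cdot)$ is a subcoupling
for $\mathcal{P}_t(w, \cdot)$ and $\mathcal{P}_t(w, \cdot)$ for every $(x, y, w) \in \mathcal{X}^2 \times \mathcal{W}$, it follows immediately
from the invariance of $\mu$ and $\nu$ that $\hat{\mu} := \Pi_{\mathcal{W}_1 \times \mathcal{X}_1}^* \theta^{(\mu,\nu)}$ and $\hat{\nu} := \Pi_{\mathcal{W}_2 \times \mathcal{X}_2}^* \theta^{(\mu,\nu)}$
are smaller than $\mu$ and $\nu$, respectively. Let us now consider the measure
$\bar{\nu} := \Pi_{\mathcal{W}_1 \times \mathcal{X}_2}^* \theta^{(\mu,\nu)}$ on $\mathcal{W} \times \mathcal{X}$. The definitions of $\bar{\nu}$ and $\hat{\nu}$ yield

$R_s^* Q\bar{\nu} = \int R_s^* Q\delta_{(y,\omega)}\, \theta^{(\mu,\nu)}(dx, dy, d\omega, d\tilde{\omega}),$

$R_s^* Q\hat{\nu} = \int R_s^* Q\delta_{(y,\tilde{\omega})}\, \theta^{(\mu,\nu)}(dx, dy, d\omega, d\tilde{\omega}).$

Since $(\omega, \tilde{\omega}) \in \mathcal{N}_{\mathcal{W}}^s$, it follows that $R_s^* Q\bar{\nu} \sim R_s^* Q\hat{\nu}$.

It remains to prove that $R_s^* Q\hat{\mu} \not\perp R_s^* Q\bar{\nu}$. To see this, we observe that
$\Gamma := \Pi_{\mathcal{W}}^* \hat{\mu} = \Pi_{\mathcal{W}}^* \bar{\nu}$ and therefore, by the triangle inequality and the fact
that the measures $\hat{\mu}$ and $\bar{\nu}$ give full measure to $\mathcal{N}^s$, one has the inequality

$\|R_s^* Q\hat{\mu} - R_s^* Q\bar{\nu}\|_{TV} \le \int_{\mathcal{W}} \int_{\mathcal{X}^2} \|R_s^* Q\delta_{(x,\omega)} - R_s^* Q\delta_{(y,\omega)}\|_{TV}\, \hat{\mu}(\omega, dx)\, \bar{\nu}(\omega, dy)\, \Gamma(d\omega)$

$< 2\Gamma(\mathcal{W}) = \|R_s^* Q\hat{\mu}\|_{TV} + \|R_s^* Q\bar{\nu}\|_{TV},$

where $\hat{\mu}(\omega, \cdot)$ and $\bar{\nu}(\omega, \cdot)$ are disintegrations of $\hat{\mu}$ and $\bar{\nu}$, respectively, with
respect to $\Gamma$. The strict inequality from the first to the second line is an
immediate consequence of the fact that the integral can be restricted to
$\mathcal{N}_{\mathcal{X}}^s$ without changing its value. The claim $R_s^* Q\hat{\mu} \not\perp R_s^* Q\bar{\nu}$ is then a consequence
of the fact that if $\mu$ and $\nu$ are any two positive measures, then $\mu \perp \nu$ if and
only if $\|\mu - \nu\|_{TV} = \|\mu\|_{TV} + \|\nu\|_{TV}$.

Finally, note that since $\hat{\mu} \le \mu$ and $\hat{\nu} \le \nu$ by definition, we have $R_s^* Q\hat{\mu} \le
R_s^* Q\mu$ and $R_s^* Q\hat{\nu} \le R_s^* Q\nu$. This completes the proof of the theorem. □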

The conditions of Theorem 3.2 do not appear to be easily verifiable at 
first sight. The remainder of this section is devoted to providing useful char- 
acterizations on the dynamics generated by an SDS A on the state space X 
which give sufficient conditions for the assumptions in Theorem 3.2 to hold. 
It turns out that such properties are analogous to the strong Feller property 
and topological irreducibility in the Markovian setting. 

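Definition 3.3. An SDS $\Lambda$ is said to be strong Feller at time $t$ if there
exists a jointly continuous function $\ell : \mathcal{X}^2 \times \mathcal{W} \to \mathbf{R}_+$ such that

(3.2) $\|R_t^* Q\delta_{(x,\omega)} - R_t^* Q\delta_{(y,\omega)}\|_{TV} \le \ell(x, y, \omega)$

and $\ell(x, x, \omega) = 0$ for every $x \in \mathcal{X}$ and every $\omega \in \mathcal{W}$.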

Remark 3.4. If the process is Markov in $\mathcal{X}$, then the total variation
distance between $R_t^* Q\delta_{(x,\omega)}$ and $R_t^* Q\delta_{(y,\omega)}$ is equal to the total variation
distance between the transition probabilities at time $t$ starting from $x$ and
$y$, respectively. Therefore, Definition 3.3 reduces in this case to the statement
"the transition probabilities at time $t$ are continuous in the total variation
topology." This implies the usual strong Feller property but is not equivalent
to it. However, it can be shown that if a Markov semigroup is strong Feller
at time $t$, then the corresponding transition probabilities at time $2t$ are
continuous in the total variation topology. This implies that for our purpose
(where we are only interested in the behavior at large times anyway), Definition 3.3
is equivalent to the usual strong Feller property in the Markovian case.
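
As an illustration of Definition 3.3 in the Markovian case, consider the one-dimensional Ornstein-Uhlenbeck process $dX = -X\,dt + dW$, which is not taken from this article and serves only as an example. Its transition probability from $x$ at time $t$ is Gaussian with mean $x e^{-t}$ and variance $(1 - e^{-2t})/2$, so the total variation distance between two transition probabilities has a closed form and can play the role of $\ell$ in (3.2). The sketch below uses the normalization in which mutually singular probability measures are at distance 2; the function name `ou_tv_distance` is ours.

```python
import math


def ou_tv_distance(x, y, t):
    """Total variation distance (normalized so that singular probability
    measures are at distance 2) between the time-t transition probabilities
    of dX = -X dt + dW started at x and at y."""
    mean_gap = abs(x - y) * math.exp(-t)
    sigma = math.sqrt((1.0 - math.exp(-2.0 * t)) / 2.0)
    # Two Gaussians with equal variance: sup_A |mu(A) - nu(A)| = erf(gap / (2 sqrt(2) sigma)).
    return 2.0 * math.erf(mean_gap / (2.0 * math.sqrt(2.0) * sigma))


for y in (1.0, 1.001, 1.1, 2.0, 5.0):
    print(y, ou_tv_distance(1.0, y, t=1.0))
```

The distance vanishes on the diagonal, depends continuously on $(x, y)$ and never exceeds 2, which is what the definition asks of $\ell$.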

Definition 3.5. An SDS $\Lambda$ is said to be topologically irreducible at time
$t$ if for every $x \in \mathcal{X}$, $\omega \in \mathcal{W}$ and every nonempty open set $U \subset \mathcal{X}$, one has
$Q_t(x, \omega; U \times \mathcal{W}) > 0$.

Remark 3.6. Since the dynamics which we are interested in take place
in the state space $\mathcal{X}$, we do not generally require that the underlying Markov
process generated by the semigroup $Q_t$ be irreducible in the usual sense. In
fact, the above definition is much weaker than irreducibility of the Markov
semigroup $Q_t$.

In the sequel, we will use the following notation. If $\mu$ is a finite measure
on a measurable space $(Y, \mathcal{B}(Y))$ and $O \in \mathcal{B}(Y)$, then we write $\mu|_O(A) :=
\mu(A \cap O)$ for $A \in \mathcal{B}(Y)$. Next, we introduce a class of SDS which plays an
important role in the theory.

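Definition 3.7. An SDS $\Lambda$ is said to be quasi-Markovian if for any two
open sets $V$, $U$ in $\mathcal{W}$ and for every $t, s > 0$, there exists a measurable map
$\omega \mapsto \mathcal{P}_s^{V,U}(\omega, \cdot) \in \mathcal{M}_+(\mathcal{W}^2)$ such that:

(i) the measure $\mathcal{P}_s^{V,U}(\omega, \cdot)$ is a subcoupling for $\mathcal{P}_s(\omega, \cdot)|_V$ and $\mathcal{P}_s(\omega, \cdot)|_U$
for every $\omega \in \mathcal{W}$,

(ii) one has $\mathcal{P}_s^{V,U}(\omega; \mathcal{N}_{\mathcal{W}}^t) > 0$ for every $\omega$ such that $\min\{\mathcal{P}_s(\omega; V),
\mathcal{P}_s(\omega; U)\} > 0$.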

Remark 3.8. The terminology quasi-Markovian is motivated by the
following fact. The process on $\mathcal{X}$ generated by the SDS $\Lambda$ is a Markov process
in its own filtration precisely if $Q\delta_{(x,\omega)}$ is independent of $\omega$. In this case,
$\mathcal{N}_{\mathcal{W}}^t = \mathcal{W}^2$ for every $t > 0$ and one can simply choose $\mathcal{P}_s^{V,U}(\omega; \cdot) = \mathcal{P}_s(\omega; \cdot)|_V \times
\mathcal{P}_s(\omega; \cdot)|_U$ in the definition above so that $\Lambda$ is also quasi-Markovian.
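
The product construction of Remark 3.8 is easy to check on a finite noise space. The following sketch is purely illustrative: the three-point space, the weights and the helper names are hypothetical. It verifies that the product of two restricted measures is a subcoupling of these restrictions, and that an inflated product measure is not.

```python
def restrict(mu, subset):
    """The restriction mu|_O(A) = mu(A n O), for a measure given as {point: mass}."""
    return {w: (p if w in subset else 0.0) for w, p in mu.items()}


def product(mu, nu):
    """The product measure mu x nu on pairs of points."""
    return {(a, b): p * q for a, p in mu.items() for b, q in nu.items()}


def is_subcoupling(pi, mu, nu, tol=1e-12):
    """Check pi(A x W) <= mu(A) and pi(W x A) <= nu(A); on a finite space it
    suffices to test singletons A."""
    first = {a: sum(p for (u, _), p in pi.items() if u == a) for a in mu}
    second = {b: sum(p for (_, v), p in pi.items() if v == b) for b in nu}
    return (all(first[a] <= mu[a] + tol for a in mu)
            and all(second[b] <= nu[b] + tol for b in nu))


# Hypothetical one-step noise distribution on a three-point space.
P_s = {"a": 0.5, "b": 0.3, "c": 0.2}
V, U = {"a", "b"}, {"b", "c"}

pi = product(restrict(P_s, V), restrict(P_s, U))
print(is_subcoupling(pi, restrict(P_s, V), restrict(P_s, U)), sum(pi.values()))
```

The total mass of the product is $\mathcal{P}_s(V)\,\mathcal{P}_s(U)$, which is positive exactly when both sets are charged, in line with condition (ii) of Definition 3.7 in the Markovian case.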

Remark 3.9. Definition 3.7 depends very weakly on the choice of the 
SDS A. It is weak in the sense that it does not take into account the fine 
details of the dynamics of A, but only how the noise enters the system. For 
example, we will show below that the solution to any SDE driven by fBm 
is always quasi-Markovian, without any restriction on the coefficients of the 
equation (1.1) other than what is required to obtain a well-posed SDS. 

The last result of this section, which is the main abstract result of the 
present work, combines these definitions into a general criterion for an SDS 
to have a unique invariant measure. It is the analogue in our non-Markovian 
setting of Theorem 3.1 for Markovian systems. 

Theorem 3.10. If there exist times $s > 0$ and $t > 0$ such that a quasi-
Markovian SDS $\Lambda$ is strong Feller at time $t$ and irreducible at time $s$, then
it satisfies the assumptions of Theorem 3.2. In particular, it can have at most
one invariant measure.



ERGODIC THEORY FOR SDES WITH EXTRINSIC MEMORY 






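Proof. Since $\mathcal{W}$ is Polish, there exists a countable dense subset $\{w_n\}_{n \ge 0}$
and a metric $d_{\mathcal{W}}$ generating the topology of $\mathcal{W}$. Given this, we denote by
$\{V_n\}_{n \ge 0}$ the countable collection of open balls with radius $1/2^k$ and center $w_m$ for
all pairs of positive integers $k$ and $m$. We also choose a metric $d_{\mathcal{X}}$ on $\mathcal{X}$. We
will henceforth denote by $B^{\mathcal{X}}_\rho(x) \subset \mathcal{X}$ the open ball of radius $\rho$ centered at
$x$ and similarly by $B^{\mathcal{W}}_\rho(w) \subset \mathcal{W}$ the open balls in $\mathcal{W}$.

In order to verify the assumptions of Theorem 3.2, our aim is to find a
measurable function

(3.3) $(x, y, w) \mapsto (n_x, n_y)$

and to define

(3.4) $\mathcal{P}_s^{x,y}(\omega, \cdot) = \mathcal{P}_s^{V_{n_x}, V_{n_y}}(\omega, \cdot),$

where the right-hand side uses the family of subcouplings from Definition 3.7.

Note, first, that it is possible to construct a measurable function $f : \mathcal{W} \to
\mathcal{W}$ with the property that $f(\omega) \in \operatorname{supp} \mathcal{P}_s(\omega, \cdot)$ for every $\omega \in \mathcal{W}$. One way of
constructing it is to define

$n_1(\omega) = \inf\{n \mid \mathcal{P}_s(\omega, B^{\mathcal{W}}_{2^{-1}}(w_n)) > 0\}$

and then, recursively,

$n_k(\omega) = \inf\{n \mid \mathcal{P}_s(\omega, B^{\mathcal{W}}_{2^{-k}}(w_n)) > 0 \text{ and } w_n \in B^{\mathcal{W}}_{2^{1-k}}(w_{n_{k-1}(\omega)})\}.$

It follows from the density of the $w_k$ that $n_k(\omega) < \infty$ for every $k$ and every $\omega$.
It follows from the construction that the sequence $w_{n_k(\omega)}$ is Cauchy for every
$\omega \in \mathcal{W}$, so we can then set $f(\omega) = \lim_{k \to \infty} w_{n_k(\omega)}$. Since, by construction,
$\mathcal{P}_s(\omega, B^{\mathcal{W}}_\rho(f(\omega))) > 0$ for every $\rho > 0$, the function $f$ indeed has the required
properties.

Define the map $\bar{x} : \mathcal{X} \times \mathcal{W} \to \mathcal{X}$ by

$\bar{x}(x, \omega) = \Lambda_s(x, f(\omega)),$

as well as the measurable map $r : \mathcal{X} \times \mathcal{W} \to \mathbf{R}_+$ by

(3.5) $r(x, \omega) = \sup\{\rho \mid \ell(x', \bar{x}(x, \omega), \omega') \le 1/4 \text{ for all } (x', \omega') \in \mathcal{D}_\rho(x, \omega)\},$

where we use $\mathcal{D}_\rho(x, \omega)$ as shorthand for $B^{\mathcal{X}}_\rho(\bar{x}(x, \omega)) \times B^{\mathcal{W}}_\rho(f(\omega))$. Since the
function $\ell$ in Definition 3.3 is jointly continuous and vanishes on the diagonal,
one has $r(x, \omega) > 0$ for every $x \in \mathcal{X}$ and every $\omega \in \mathcal{W}$.

Given these objects, we are now in a position to define $n_x$ and $n_y$. Consider
$(x, y, \omega)$ to be given and set $\rho = r(x, \omega)$, $\bar{\omega} = f(\omega)$ and $\bar{x} = \Lambda_s(x, \bar{\omega})$. We set

$n_x = \inf\{n \mid V_n \subset B^{\mathcal{W}}_\rho(\bar{\omega}),\ \bar{\omega} \in V_n \text{ and } \Lambda_s(x, V_n) \subset B^{\mathcal{X}}_\rho(\bar{x})\},$

(3.6)

$n_y = \inf\{m \mid \mathcal{P}_s(\omega, V_m) > 0 \text{ and } \Lambda_s(y, V_m) \subset B^{\mathcal{X}}_\rho(\bar{x})\}.$

Since one has $\Lambda_s(x, \bar{\omega}) = \bar{x}$ by definition, it follows from the continuity of $\Lambda_s$
and the definition of the $V_n$ that $n_x$ is finite for every possible value of $x$ and
$\omega$. The fact that $n_y$ is finite for every possible value of $x$, $y$ and $\omega$ follows
from the topological irreducibility of $\Lambda$. This shows that $\mathcal{P}_s^{x,y}(\omega, \cdot)$ as given
by (3.4) and (3.6) is a map satisfying the first assumption of Theorem 3.2.

It remains to show that (3.1) also holds. Since $\bar{\omega} \in V_{n_x}$, one has $\mathcal{P}_s(\omega, V_{n_x}) >
0$. Furthermore $\mathcal{P}_s(\omega, V_{n_y}) > 0$, by construction, so $\mathcal{P}_s^{x,y}(\omega, \mathcal{N}_{\mathcal{W}}^t) > 0$. It fol-
lows from the definition of $\rho$ that one has

$\|R_t^* Q\delta_{(\hat{x}, w)} - R_t^* Q\delta_{(\hat{y}, w)}\|_{TV} \le \tfrac{1}{2}$

for every $(\hat{x}, \hat{y}) \in B^{\mathcal{X}}_\rho(\bar{x})^2$ and every $w \in V_{n_x}$. It follows immediately from the
definition of $\mathcal{N}_{\mathcal{W}}^t$ that

(3.7) $R_t^* Q\delta_{(\hat{x}, w_x)} \not\perp R_t^* Q\delta_{(\hat{y}, w_y)}$

for every $(\hat{x}, \hat{y})$ as above, every $w_x \in V_{n_x}$, and every $w_y$ such that $(w_x, w_y) \in
\mathcal{N}_{\mathcal{W}}^t$. Since one has $\Lambda_s(x, w_x) \in B^{\mathcal{X}}_\rho(\bar{x})$ and $\Lambda_s(y, w_y) \in B^{\mathcal{X}}_\rho(\bar{x})$ for every $(w_x, w_y) \in
V_{n_x} \times V_{n_y}$, (3.1) is now a consequence of (3.7) and of condition (ii) in Defi-
nition 3.7. □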
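
The recursive selection of a point in the support of a measure, used at the start of the proof above, can be sketched in a simple setting. The code below assumes that the space is the interval $[0, 1]$, that the dense sequence consists of the dyadic rationals, and that the measure is a hypothetical discrete one. None of these choices comes from this article, and the recursion uses radius $2^{-k}$ at every step $k$.

```python
from itertools import count


def dyadics():
    """Enumerate a countable dense subset {w_n} of [0, 1]."""
    yield 0.0
    yield 1.0
    for level in count(1):
        for i in range(1, 2 ** level, 2):
            yield i / 2 ** level


def ball_mass(atoms, center, radius):
    """Mass that a discrete measure {point: weight} gives to an open ball."""
    return sum(p for a, p in atoms.items() if abs(a - center) < radius)


def support_point(atoms, depth):
    """Follow the recursion n_1, n_2, ... for `depth` steps and return w_{n_depth}."""
    prev = None
    for k in range(1, depth + 1):
        for w in dyadics():
            # First index n whose ball of radius 2^-k is charged and whose
            # center lies within 2^(1-k) of the previously selected point.
            close = prev is None or abs(w - prev) < 2.0 ** (1 - k)
            if close and ball_mass(atoms, w, 2.0 ** -k) > 0:
                prev = w
                break
    return prev


atoms = {0.3: 0.6, 0.8: 0.4}   # hypothetical discrete measure on [0, 1]
approx = support_point(atoms, depth=12)
print(approx, min(abs(approx - a) for a in atoms))
```

Each selected center has an atom within distance $2^{-k}$, so the sequence of centers converges to a point of the support, as in the Cauchy argument above.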

The remainder of this article is devoted to showing that it is possible to 
associate an SDS to (1.1) in such a way that the assumptions of Theorem 3.10 
are satisfied. 

4. Construction of the SDS. In this section, we construct a continuous 
SDS induced by the SDE (1.1) in the sense that for every generalized initial 
condition $\mu$, the probability measure $Q\mu$ on the path space is an adapted
solution to (1.1). This will then allow us to investigate ergodic properties of 
the SDE (1.1) according to the program laid out in the Introduction. 

In this work, we will make use of the well-known Mandelbrot-Van Ness 
representation of the fBm [14]. The advantage of this representation is that
it is invariant under time-shifts, so it is natural for the study of ergodic
properties. We may represent the two-sided fBm $B_H$ with Hurst parameter
$H \in (0,1)$ in terms of a two-sided Brownian motion $B$ as

(4.1) $B_H(t) = \alpha_H \int_{-\infty}^0 (-r)^{H - 1/2} \bigl(dB(r + t) - dB(r)\bigr)$

for some $\alpha_H > 0$ (see [21] for more details).

4.1. Noise space and the stationary noise process. The aim of this section 
is to define the stationary noise process which will be used to investigate 
ergodic properties of the SDE (1.1) as laid out in Section 3. The main step 
is to consider a suitable Polish space in such a way that: 
(1) there exists a Feller semigroup on the noise space which admits the
fractional Brownian motion measure as its unique invariant measure;

(2) the solution map of the SDE (1.1) is continuous with respect to both
the driving noise and initial conditions on $\mathbf{R}^d$.

One should note that such properties are closely related to the topology given 
on the noise space. In particular, if we consider white noise (or fractional 
noise with H < 1/2), property (2) could not be realized on any conventional 
space of paths, but one would have to work with rough paths instead [3]. 

The remainder of this section is devoted to choosing a topology which 
realizes (1) and (2). At first, one should note that by the properties of the 
fBm, it is convenient to work with Polish spaces defined by some norm which 
captures the Hölder continuity on bounded intervals and, at the same time,
gives some kind of regularity at infinity. For this purpose, if $\gamma \in (0, 1)$ and
$\delta \in (0, 1)$, then we define $\mathcal{W}_{(\gamma,\delta)}$ as the completion of $C_0^\infty(\mathbf{R}_-; \mathbf{R})$ with respect
to the norm

(4.2) $\|\omega\|_{(\gamma,\delta)} = \sup_{s,t \in \mathbf{R}_-} \frac{|\omega(t) - \omega(s)|}{|t - s|^\gamma (1 + |t| + |s|)^\delta}.$

We write $\tilde{\mathcal{W}}_{(\gamma,\delta)}$ for the corresponding space containing functions on the
positive line instead of the negative one. We also write $\mathcal{W}_{(\gamma,\delta),T}$ and $\tilde{\mathcal{W}}_{(\gamma,\delta),T}$
when we restrict the arguments to the intervals $[-T, 0]$ and $[0, T]$, respec-
tively. It should be noted that $\|\cdot\|_{(\gamma,\delta),T}$ is equivalent to the Hölder norm
$\|\cdot\|_\gamma$ for every $0 < T < \infty$. Moreover, $\mathcal{W}_{(\gamma,\delta)}$ is a separable Banach space.
The following lemma states that there is a Wiener measure on $\mathcal{W}_{(\gamma,\delta)}$ for
the fBm.



Lemma 4.1. Let $H \in (1/2, 1)$, $\gamma \in (1/2, H)$ and $\gamma + \delta \in (H, 1)$. There ex-
ists a Borel probability measure $\mathbf{P}_H$ on $\mathcal{W}_{(\gamma,\delta)}$ such that the canonical process
associated to $\mathbf{P}_H$ is a fractional Brownian motion with Hurst parameter $H$.

Proof. One can show, as in [7], that the set of all continuous functions
$\omega$ with $\|\omega\|_{(\gamma',\delta')} < \infty$ for some $\gamma' > \gamma$ and some $\delta'$ such that $\delta' + \gamma' < \delta + \gamma$
is contained in $\mathcal{W}_{(\gamma,\delta)}$. The claim then follows from Kolmogorov's criterion
and the behavior of fractional Brownian motion under time inversion. □



One can similarly show that the two-sided fractional Brownian motion
can be realized as the canonical process for some Borel measure $\tilde{\mathbf{P}}_H$ on
$\mathcal{W}_{(\gamma,\delta)} \times \tilde{\mathcal{W}}_{(\gamma,\delta)}$.

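Consider now the operator $\mathcal{A}$ defined by

(4.3) $\mathcal{A}\omega(t) = \beta_H \int_0^\infty \frac{1}{r}\, g\Bigl(\frac{t}{r}\Bigr)\, \omega(-r)\,dr,$

where $g$ is given by

$g(x) = x^{H - 1/2} + (H - 3/2)\, x \int_0^1 \frac{(u + x)^{H - 5/2}}{(1 - u)^{H - 1/2}}\,du$

and $\beta_H = (H - 1/2)\,\alpha_H \alpha_{1-H}$. It is shown in Proposition A.2 below that $\mathcal{A}$
actually defines a bounded linear operator from $\mathcal{W}_{(\gamma,\delta)}$ into $\tilde{\mathcal{W}}_{(\gamma,\delta)}$.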

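It will be convenient in the sequel to make use of fractional integration and
differentiation. For $\alpha \in (0,1)$, we define the fractional integration operator
$\mathcal{I}^\alpha$ and the corresponding fractional differentiation operator $\mathcal{D}^\alpha$ by

(4.4) $\mathcal{I}^\alpha f(t) = \frac{1}{\Gamma(\alpha)} \int_0^t (t - s)^{\alpha - 1} f(s)\,ds, \qquad \mathcal{D}^\alpha f(t) = \frac{1}{\Gamma(1 - \alpha)} \frac{d}{dt} \int_0^t (t - s)^{-\alpha} f(s)\,ds.$

For a comprehensive survey of the properties of these operators, see [20].
The most important property that we are going to use here is that $\mathcal{I}^\alpha$ and
$\mathcal{D}^\alpha$ are each other's inverses. Furthermore, denoting by $\tau_h : w \mapsto w + h$ the
shift map on $\tilde{\mathcal{W}}_{(\gamma,\delta)}$, we have the following result.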

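Lemma 4.2. Let $\mathcal{H}(\omega, \cdot)$ be the transition kernel from $\mathcal{W}_{(\gamma,\delta)}$ to $\tilde{\mathcal{W}}_{(\gamma,\delta)}$
given by

where $\mathbf{W}$ is the Wiener measure over $\mathbf{R}_+$. Then $\mathcal{H}$ is the disintegration of
$\tilde{\mathbf{P}}_H$ with respect to $\mathbf{P}_H$.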

Proof. This is a lengthy, but straightforward, calculation, using the 
representation (4.1) for the fractional Brownian motion. □ 

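We are now in a position to define our stationary noise process. For this,
let us consider the one-sided Wiener shift $\theta_t : \mathcal W_{(\gamma,\delta)} \to \mathcal W_{(\gamma,\delta)}$ defined by

(4.5) $\theta_t \omega(s) := \omega(s - t) - \omega(-t), \qquad s \in \mathbf R_-,\ t \in \mathbf R_+$.

In order to construct the transition semigroup on $\mathcal W_{(\gamma,\delta)}$, we also introduce
the "concatenation" function $M_t : \mathcal W_{(\gamma,\delta)} \times \tilde{\mathcal W}_{(\gamma,\delta)} \to \mathcal W_{(\gamma,\delta)}$ defined by

(4.6) $M_t(\omega, \tilde\omega)(s) := \begin{cases} \tilde\omega(s+t) - \tilde\omega(t), & \text{if } -t < s, \\ \omega(s+t) - \tilde\omega(t), & \text{if } s \le -t. \end{cases}$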

With these definitions at hand, we set

(4.7) $\mathcal P_t(\omega, \cdot) := M_t^*(\omega, \cdot)\mathcal H(\omega, \cdot)$ for $\omega \in \mathcal W_{(\gamma,\delta)}$ and $t \in \mathbf R_+$.

We are now in a position to state the following result.

Lemma 4.3. The quadruple $(\mathcal W_{(\gamma,\delta)}, \{\mathcal P_t\}, \mathbf P_H, \{\theta_t\})$ is a stationary noise
process.

Proof. It is obvious from the definition that $M_t$ is continuous from
$\mathcal W_{(\gamma,\delta)} \times \tilde{\mathcal W}_{(\gamma,\delta)}$ to $\mathcal W_{(\gamma,\delta)}$. Moreover, the operator $\mathcal A$ is continuous
from $\mathcal W_{(\gamma,\delta)}$ to $\tilde{\mathcal W}_{(\gamma,\delta)}$. Therefore, we may conclude that $\mathcal P_t(w,\cdot)$ is a Feller semigroup
on $\mathcal W_{(\gamma,\delta)}$. The fact that its only invariant measure is given by $\mathbf P_H$ is a
straightforward calculation. All the other properties follow immediately from
the definitions. □
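
The interplay between the shift (4.5) and the concatenation (4.6) can be illustrated on explicit paths. The sketch below is only an illustration under stated assumptions: paths are represented as ordinary Python functions, and `w_past`, `w_future` and the value of `t` are hypothetical choices vanishing at the origin. It checks that the glued path vanishes at time 0, is continuous at the junction $s = -t$, and that shifting the glued path by $t$ recovers the original past, that is, $\theta_t M_t(\omega, \tilde\omega) = \omega$.

```python
import math

def wiener_shift(t, w):
    """One-sided shift (4.5): (theta_t w)(s) = w(s - t) - w(-t) for s <= 0."""
    return lambda s: w(s - t) - w(-t)

def concatenate(t, w_past, w_future):
    """Concatenation (4.6) of a past path (on s <= 0) with a future path (on r >= 0)."""
    def path(s):
        if s > -t:
            # The last t units of time are filled in by the future path.
            return w_future(s + t) - w_future(t)
        # Older times come from the past path, re-centred so that path(0) = 0.
        return w_past(s + t) - w_future(t)
    return path

# Hypothetical smooth paths, both equal to 0 at the origin.
w_past = lambda s: math.sin(s) - 0.5 * s          # defined for s <= 0
w_future = lambda r: r * r + math.cos(r) - 1.0    # defined for r >= 0

t = 1.5
glued = concatenate(t, w_past, w_future)
recovered = wiener_shift(t, glued)
```

For instance, `recovered(-2.0)` agrees with `w_past(-2.0)` up to rounding.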

4.2. Definition of the SDS and existence of an invariant measure. Recall
that we are concerned with the multidimensional SDE

(4.8) $X_t = X_0 + \int_0^t f(X_s)\, ds + \int_0^t \sigma(X_s)\, dB_H(s), \qquad 1/2 < H < 1$,

where the integral with respect to $B_H$ is a pathwise Riemann-Stieltjes integral.
This kind of equation has been studied by several authors (see, e.g.,
[3, 15, 18]) using different approaches, but we will mainly use the regularity
results from [15].

Note that we actually need $d$ independent fractional Brownian motions
to drive (4.8), so we consider $d$ copies of the stationary noise process defined
in Lemma 4.3. With a slight abuse of notation, we again denote it by
$(\mathcal W_{(\gamma,\delta)}, \{\mathcal P_t\}, \mathbf P_H, \{\theta_t\})$. We define the continuous shift operator
$\mathcal R_T : \mathcal W_{(\gamma,\delta)} \to \tilde{\mathcal W}_{(\gamma,\delta),T}$ by $(\mathcal R_T h)(t) := h(t - T) - h(-T)$ and set

(4.9) $\Lambda : \mathbf R_+ \times \mathbf R^d \times \mathcal W_{(\gamma,\delta)} \to \mathbf R^d$,

defined by $\Lambda_t(x,\omega) := \Phi_t(x, \mathcal R_t \omega)(t)$, where $\Phi_t : \mathbf R^d \times \tilde{\mathcal W}_{(\gamma,\delta),t} \to C([0,t], \mathbf R^d)$ is the
solution map of equation (1.1) which depends on the initial conditions and
the noise.

Proposition 4.4. Let $\Lambda$ be the function defined in (4.9). Then $\Lambda$ is a
stochastic dynamical system over the stationary noise process $(\mathcal W_{(\gamma,\delta)}, \{\mathcal P_t\},
\mathbf P_H, \{\theta_t\})$. Moreover, for every generalized initial condition $\mu$, the process
generated by $\Lambda$ for $\mu$ is an adapted weak solution of the SDE (1.1).

Proof. The regularity properties follow from [15], Theorem 5.1, and
the fact that $\|\cdot\|_{(\gamma,\delta),T}$ is equivalent to the Hölder norm $\|\cdot\|_\gamma$ for every
$0 < T < \infty$. The cocycle property is a direct consequence of the composition
property of ODEs since we are dealing with pathwise solutions. Furthermore,
it is obvious from Lemma 4.2 that every process generated by a generalized
initial condition from $\Lambda$ is a weak solution of equation (4.8).
The adaptedness of the solution is a consequence of the construction since
the transition probabilities $\mathcal P_t$ of the noise process do not depend on the
solution $x$. □

To conclude this section, we study the problem of existence of the invariant 
measure for equation (4.8). The main difficulty in proving it comes from 
the pathwise stochastic integral. Suitable bounds on the stochastic integral, 
together with a dissipativity assumption, are sufficient to ensure existence. 
More specifically, in order to prove existence of the invariant measure, we 
make use of a Lyapunov function in the following sense. 

Definition 4.5. We say that $V : \mathbf R^d \to \mathbf R_+$ is a Lyapunov function for
$\Lambda$ if $V^{-1}([0, K])$ is compact for every $0 < K < \infty$ and there exists a constant
$C$ and a continuous function $\xi : [0, 1] \to \mathbf R_+$ with $\xi(1) < 1$ such that

$\int V(x)\, Q_t\mu(dx, dw) \le C + \xi(t) \int V(x)\, \mu(dx, dw)$

for every $t \in [0,1]$ and every generalized initial condition $\mu$.

Note that this definition is slightly different from the one given in [7], but
it is straightforward to check that the Krylov-Bogoliubov criterion nevertheless
applies, so the existence of a Lyapunov function ensures the existence
of an invariant measure.
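
The reason the bound of Definition 4.5 suffices can be seen by iterating it at $t = 1$: writing $a_n$ for the integral of $V$ against the law at time $n$, one has $a_{n+1} \le C + \xi(1) a_n$ and therefore $a_n \le \xi(1)^n a_0 + C/(1 - \xi(1))$, which gives the uniform moment bound (hence tightness) used by the Krylov-Bogoliubov argument. The following sketch iterates the worst case of this recursion; the numerical values of `C`, `xi` and `a0` are hypothetical.

```python
def lyapunov_moments(a0, C, xi, steps):
    """Iterate the worst case a_{n+1} = C + xi * a_n of the Lyapunov bound at t = 1."""
    values = [a0]
    for _ in range(steps):
        values.append(C + xi * values[-1])
    return values

# Hypothetical constants: C from Definition 4.5 and xi standing for xi(1) < 1.
C, xi, a0 = 2.0, 0.5, 100.0
values = lyapunov_moments(a0, C, xi, steps=20)

# Uniform-in-time bound obtained by summing the geometric series.
uniform_bound = C / (1.0 - xi)
```

Whatever the size of `a0`, the iterates settle below `uniform_bound` up to the geometrically decaying term `xi**n * a0`.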

Proposition 4.6. Assume that the hypotheses (H1) and (H3) hold.
Then for every $p \ge 1$, the map $x \mapsto |x|^p$ is a Lyapunov function for the SDS
$\Lambda$ defined above. Consequently, there exists at least one invariant measure
for equation (4.8) and this invariant measure has moments of all orders.

Proof. The proof follows closely the proof of [15], Proposition 5.1, but
we keep track of the dependence of the constants on the initial condition. Fix
an arbitrary initial condition $x_0 \in \mathbf R^d$ and a realization $B_H$ of the fractional
Brownian motion with Hurst parameter $H$. We define $x_t$ and $z_t$ for $t \in [0, 1]$
by

$dx_t = f(x_t)\, dt + \sigma(x_t)\, dB_H(t), \qquad dz_t = f(z_t)\, dt$,

where the initial condition for $x_t$ is given by $x_0$ and the initial condition
for $z_t$ is also given by $x_0$. We also define $y_t = x_t - z_t$ so that

$y_t = \int_0^t (f(y_s + z_s) - f(z_s))\, ds + \int_0^t \sigma(y_s + z_s)\, dB_H(s) =: F_t + G_t$.

Fix two arbitrary values $\alpha \in (1 - H, 1/2)$ and $\beta \in (1 - \alpha, H)$. Following [15],
we define, for $t \in [0, 1]$,

$h_t = |y_t| + \int_0^t \frac{|y_t - y_s|}{|t-s|^{\alpha+1}}\, ds$

and verify that $h$ satisfies the conditions of the fractional Gronwall lemma
[15], Lemma 7.6. Note, first, that the global Lipschitz continuity of $f$ implies
that

Here and in the sequel, $C$ denotes a generic constant depending only on $\alpha$,
$\beta$, $f$ and $\sigma$. In order to bound $G_t$, first note that since $\sigma$ is bounded,

$|\sigma(z_s + y_s) - \sigma(z_r + y_r)| \le C|z_s - z_r|^\beta + C|y_s - y_r|$.

Also, note that the dissipativity condition on $f$ ensures that $|z_s - z_r| \le
C|s - r|(1 + |x_0|)$, so that

(4.10) $|\sigma(z_s + y_s) - \sigma(z_r + y_r)| \le C|s - r|^\beta(1 + |x_0|^\beta) + C|y_s - y_r|$.
It follows, in the same way as in [15], Proposition 5.1, that 

< C\\B H \\ f (l + |x„|" + ££ drds). 
One similarly obtains the bound 

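Combining all of the above yields for $h$ that

$h_t \le C\|B_H\|_\beta \Bigl(1 + |x_0|^\beta + \int_0^t (1 + (t-s)^{-\alpha})\, h_s\, ds\Bigr)$

$\phantom{h_t} \le C\|B_H\|_\beta \Bigl(1 + |x_0|^\beta + \int_0^t (t-s)^{-\alpha}\, t^\alpha s^{-\alpha}\, h_s\, ds\Bigr)$.

The fractional Gronwall lemma [15], Lemma 7.6, then implies that

$|h_t| \le C\|B_H\|_\beta (1 + |x_0|^\beta) \exp\bigl(C\|B_H\|_\beta^{1/(1-\alpha)}\, t\bigr)$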

for every $t \in [0, 1]$. Furthermore, the dissipativity condition (H3) ensures the
existence of $\gamma > 0$ such that $|z_t|^p \le e^{-\gamma t}|x_0|^p + C$. Combining these bounds
shows that for every $\eta > 0$, there exists a constant $C$ such that

$|x_t|^p \le (1 + \eta)e^{-\gamma t}|x_0|^p + C\exp\bigl(C\|B_H\|_\beta^{1/(1-\alpha)}\, t\bigr)$.

Since $\|B_H\|_\beta$ is almost surely finite and $B_H$ is a Gaussian process, it has
Gaussian tails by Fernique's theorem. This shows that there exists a constant
$C$ such that, for every $t \in [0, 1]$, one has the bound

$\int |x|^p\, (Q_t\mu)(dx, dw) \le (1 + \eta)e^{-\gamma t} \int |x|^p\, \mu(dx, dw) + C$,






uniformly over all generalized initial conditions $\mu$. Since $\eta$ was arbitrary and
affects only the value of the constant $C$, one can choose it in such a way
that $(1 + \eta)e^{-\gamma} < 1$, thus concluding the proof of Proposition 4.6. □

5. Uniqueness of the invariant measure. In order to simplify our notation,
we fix once and for all $\gamma \in (1/2, H)$ and $\delta > 0$ such that $H < \gamma + \delta < 1$
and we denote by $\mathcal W$ the noise space with these indices. We also use the
shorthand $\tilde{\mathcal W}$ for $\tilde{\mathcal W}_{(\gamma,\delta)}$ and $\mathcal W_T$ for $\tilde{\mathcal W}_{(\gamma,\delta),T}$.

The main goal of this section is to prove that the strong Feller prop- 
erty defined in Section 3 holds for the solutions to equation (4.8). This 
property, together with an irreducibility argument, will then provide the 
uniqueness of the invariant measure for our system. In the Markovian case, 
one efficient probabilistic tool to recover the strong Feller property is the 
Bismut-Elworthy-Li formula [5]. The main feature of this formula is that 
it provides bounds on the derivatives of a Markov semigroup which are 
independent of the bounds on the derivatives of the test function. In the 
non-Markovian case, one would expect to recover the strong Feller property 
by using a similar idea. In the language of the present article, given a measurable
function $\varphi : C([1, \infty); \mathbf R^d) \to \mathbf R$, a Bismut-Elworthy-Li type formula
would be an expression for the derivative (in the $x$ variable) of the function
$Q\varphi : \mathbf R^d \times \mathcal W \to \mathbf R$ defined by

(5.1) $Q\varphi(x,w) := \int_{C([1,\infty),\mathbf R^d)} \varphi(z)\, \mathcal R_1^* \mathcal Q\delta_{(x,w)}(dz)$

that does not involve any derivative of $\varphi$.

The main technical difficulty one faces when trying to implement this
program is that it seems to be very hard to prove that the Jacobian $J_{0,t}$ of
the flow has bounded moments. We will overcome this by a cutoff procedure
in Wiener space. The price we have to pay is that we are not able to show
that $Q\varphi$ is differentiable in $x$, but only that it is continuous in $x$ for any given
$w \in \mathcal W$ and that its modulus of continuity can be bounded by a function of
$|x|$ and $\|w\|_{(\gamma,\delta)}$ only, uniformly over $\varphi$ with $\|\varphi\|_\infty \le 1$. This is, however,
sufficient for Definition 3.3 to apply.

We will need some basic lemmas concerning the smoothness of the solutions
with respect to their initial conditions and with respect to the noise.
We begin with an elementary regularity result. For the sake of completeness, we
give the details here. As in the previous section, we denote by

$\Phi_T : \mathbf R^d \times \mathcal W_T \to \mathcal W_T$

the solution map for (1.1). (The fact that its image actually belongs to $\mathcal W_T$
and not only to $C([0,T], \mathbf R^d)$ is a consequence of Proposition 4.6 and of the
regularity results from [15].) The main regularity result used in this section
is the following.

Lemma 5.1. Assume that the coefficients of the SDE, $\sigma$ and $f$, satisfy
hypotheses (H1) and (H2). Then the map $\Phi_T$ is differentiable in both of its
arguments for each fixed $T > 0$.

Define the matrix-valued function $J_t = (D_x\Phi_T(x,w))(t)$. Then $J_t$ and its
inverse $J_t^{-1}$, respectively, satisfy the equations

(5.2) $J_t = I + \int_0^t Df(x_s) J_s\, ds + \sum_{k=1}^d \int_0^t D\sigma_k(x_s) J_s\, dw_k(s)$,

(5.3) $J_t^{-1} = I - \int_0^t J_s^{-1} Df(x_s)\, ds - \sum_{k=1}^d \int_0^t J_s^{-1} D\sigma_k(x_s)\, dw_k(s)$,

where $I$ is the $d \times d$ identity matrix and where $x_s$ is shorthand for $(\Phi_T(x, w))(s)$.
Here and below, we also define $\sigma_k$ by $(\sigma_k)_i = \sigma_{i,k}$.

For any given $v \in \mathcal W_T$, define the process $K_t^v = (D_w\Phi_T(x,w)v)(t)$. Then
$K_t^v$ satisfies

(5.4) $K_t^v = \int_0^t Df(x_s) K_s^v\, ds + \sum_{k=1}^d \int_0^t D\sigma_k(x_s) K_s^v\, dw_k(s) + \int_0^t \sigma(x_s)\, dv(s)$.

In particular, defining $J_{s,t} = J_t J_s^{-1}$, the equation

$K_t^v = \int_0^t J_{s,t}\, \sigma(x_s)\, dv(s)$

holds. Furthermore, for every bounded set $A \subset \mathbf R^d$ of initial conditions and
every $C > 0$, there exists $K > 0$ such that

$\|\Phi_T(x,w)\|_\gamma + \|D_x\Phi_T(x,w)\|_\gamma \le K$

for every $x \in A$ and every $w$ with $\|w\|_\gamma \le C$.

Proof. We know that $x_t = (\Phi_T(x,w))(t)$ is the unique solution to the
equation

$x_t = x + \int_0^t f(x_s)\, ds + \int_0^t \sigma(x_s)\, dw(s)$.

We can write this in the form $\Phi_T(x,w) = \mathcal M_T(x, w, \Phi_T(x,w))$, where
$\mathcal M_T : \mathbf R^d \times \mathcal W_T \times \mathcal W_T \to \mathcal W_T$ is a continuously differentiable map in all of
its arguments.

Therefore, the claim follows from the implicit function theorem if we can
show that the derivative of $\mathcal M_T$ in its last argument is of norm strictly
smaller than 1. This is not true in general, but it holds for $T$ sufficiently
small. The result then follows from the a priori bounds of Proposition 4.6,
together with a standard gluing argument; see also [16] for a more detailed
proof.

The equation for $J_t^{-1}$ is an immediate consequence of the chain rule. The
last expression for $K_t^v$ is a consequence of the variation of constants formula.
□

As usual with probabilistic proofs of regularization properties, the main
ingredient in the proof of the strong Feller property will be an integration by
parts formula. Since the point of view taken in most of this work is to study
the solutions to (1.1) conditioned on a realization of the past of the driving
fBm, the natural Gaussian space in which to perform this integration by
parts will be the space $\tilde{\mathcal W}$ endowed with the Gaussian measure $\mathcal H(0, \cdot)$. Note
that it follows from (4.1) and the definition of $\mathcal H$ that the law of the canonical
process $w$ under $\mathcal H(0, \cdot)$ is the same as the law of $\mathcal I^{H-1/2} w$ under the Wiener
measure $\mathbf W$. Given any function $F$ on $\tilde{\mathcal W}$, we can thus associate to it a
function $\tilde F = F \circ \mathcal I^{H-1/2}$ on Wiener space. Note, also, that the reproducing
kernel space $\mathcal K_H$ of $\mathcal H(0, \cdot)$ is precisely equal to those functions $v$ such that
$\mathcal D^{H-1/2} v$ belongs to the reproducing kernel space $\mathcal K$ of $\mathbf W$. This allows us to
carry over the whole formalism of Malliavin calculus [17].

Note that the properties of the fractional derivative and integral operators
(4.4) are such that the notion of "adaptedness" with respect to the
canonical process or the transformed process $\mathcal D^{H-1/2} w$ agree, so a process $F_t$
is adapted to the increments of the canonical process if and only if $\tilde F_t$ is
adapted to the increments of the canonical process. This allows us to speak
of an adapted process in this setting without any ambiguity. In particular,
the Malliavin derivative $\mathscr D F$ with respect to an adapted $\mathcal K_H$-valued process
$v$ can be defined in a natural way by the equality

$\langle \mathscr D F, v \rangle = \langle \mathscr D \tilde F, \mathcal D^{H-1/2} v \rangle \circ \mathcal D^{H-1/2}$.

In particular, if $F$ is Fréchet differentiable with Fréchet derivative $DF$, it is
also Malliavin differentiable and we have the equality

(5.5) $\langle DF, v \rangle = \langle \mathscr D F, v \rangle$,

where $\mathcal K_H$ is identified with a subspace of $\tilde{\mathcal W}$ in the usual way.

The following bound is an immediate consequence of the integration by 
parts formula from Malliavin calculus, together with the representation (4.1) 
of fractional Brownian motion. 

Theorem 5.2. Let $F, G : \tilde{\mathcal W} \to \mathbf R$ be Malliavin differentiable functions
such that $FG$, $F\|\mathscr D G\|_{\mathcal K_H}$ and $G\|\mathscr D F\|_{\mathcal K_H}$ are square integrable. Furthermore,
let $v : \tilde{\mathcal W} \to \mathcal K_H$ be an adapted process such that $\|v\|_{\mathcal K_H}$ is square integrable.
Then one has the bound

(5.6) $|\mathbf E(\langle \mathscr D F, v \rangle G)| \le (\mathbf E\|v\|_{\mathcal K_H}^2)^{1/2} \bigl((\mathbf E F^2 G^2)^{1/2} + (\mathbf E F^2 \|\mathscr D G\|_{\mathcal K_H}^2)^{1/2}\bigr)$.

In the assumptions and the conclusion, expectations are taken with respect
to $\mathcal H(0, \cdot)$.

Proof. Since the assumptions ensure that both sides of (5.6) are finite,
a standard density argument shows that it suffices to check (5.6) under
the assumption that $F$, $G$ and $v$ are in the space $\mathbb D^\infty$ of Malliavin smooth
functions with bounded moments of all orders; see [17] for this notation.

Denoting by $\delta$ the adjoint of the derivative operator in $L^2(\tilde{\mathcal W}, \mathcal H(0, \cdot))$,
we then have

$\mathbf E\langle \mathscr D(FG), v \rangle = \mathbf E(FG\, \delta v)$.

On the other hand, one has $\mathscr D(FG) = G\, \mathscr D F + F\, \mathscr D G$, so

$\mathbf E(\langle \mathscr D F, v \rangle G) = \mathbf E(FG\, \delta v) - \mathbf E(\langle \mathscr D G, v \rangle F)$.

It now suffices to note that the adaptedness of $v$ ensures that one can use
the Itô isometry to get $\mathbf E|\delta v|^2 = \mathbf E\|v\|_{\mathcal K_H}^2$. □

This result allows us to show the following. 

Proposition 5.3. Assuming that (H1)-(H3) hold, the SDS constructed
in the previous section has the strong Feller property of Definition 3.3.

Proof. Denote by $\mathcal W_{[1,T]}$ the restriction of functions in $\tilde{\mathcal W}$ to the interval
$[1,T]$. For every bounded measurable function $\varphi_T : \mathcal W_{[1,T]} \to \mathbf R$, define
$Q\varphi_T : \mathbf R^d \times \mathcal W \to \mathbf R$ by (5.1). The strong Feller property (3.2) follows if we
can show that

$|Q\varphi_T(x,w) - Q\varphi_T(y,w)| \le \varepsilon(x,y,w)$,

uniformly over all $T > 1$ and all bounded Fréchet differentiable functions $\varphi_T$
with bounded Fréchet derivatives such that $\sup_X |\varphi_T(X)| \le 1$.

Denoting by $\mathbf E_w$, for simplicity, expectations over $\tilde{\mathcal W}$ with respect to the
probability measure $\mathcal H(w, \cdot)$, we have

$Q\varphi_T(x,w) = \mathbf E_w \varphi_T(\Phi_T(x,w))$.

Setting $z_s = sx + (1 - s)y$ and $\xi = x - y$, this yields

(5.7) $Q\varphi_T(x,w) - Q\varphi_T(y,w) = \mathbf E_w \int_0^1 \langle D\varphi_T(\Phi_T(z_s,w)), D_x\Phi_T(z_s,w)\xi \rangle\, ds$.

The problem at this point is that we do not have the a priori bounds on
the Jacobian $D_x\Phi_T$ that would be required in order to exchange the order
of integration. This can, however, be overcome by the following cutoff
procedure.
Note that we have the following result.

Lemma 5.4. The maps $N_T : \tilde{\mathcal W} \to \mathbf R$ defined by

$N_T : w \mapsto \sup_{0 \le s < t \le T} \frac{|w(t) - w(s)|}{|t - s|^\gamma}$

belong to $\mathbb D^{2,1}$.



Proof. Note that one actually has Nt(w) = sup sVt < T u_X ■ The re- 
sult then follows from [17], Proposition 2.1.3, and the fact ([20], Theorem 3.1) 
that x H ~ 1 / 2 is bounded from H 1 to the space of 7-Holder continuous func- 
tions. □ 



Now, let $\chi : \mathbf R_+ \to [0, 1]$ be a smooth function such that $\chi(r) = 0$ for $r \ge 3$,
$\chi(r) = 1$ for $r \le 1$ and $|\chi'(r)| \le 1$ for every $r$. Then the cutoff functions
$\chi_{R,R'} : w \mapsto \chi(N_1(w)/R)\chi(N_T(w)/R')$ all belong to $\mathbb D^{2,1}$ and one has the
following obvious bound.

Lemma 5.5. There exists a constant $C > 0$ such that for every $T > 1$,
there exists $R'_T$ such that the $\mathbb D^{2,1}$ norm of $\chi_{R,R'}$ is bounded by $C$ for every
$R$ and every $R' > R'_T$.
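
A cutoff function with the stated properties can be written down explicitly. The sketch below is one possible choice and not necessarily the one the authors have in mind: it uses the standard smooth step built from $e^{-1/x}$, rescaled so that the transition takes place on $[1,3]$, where its steepest slope is exactly 1 (at $r = 2$). The names `chi` and `cutoff` and all numerical arguments are illustrative assumptions; the Hölder seminorms $N_1(w)$ and $N_T(w)$ are simply passed in as numbers.

```python
import math

def _e(x):
    """Smooth function equal to exp(-1/x) for x > 0 and to 0 for x <= 0."""
    return math.exp(-1.0 / x) if x > 0.0 else 0.0

def smooth_step(x):
    """Smooth nondecreasing step: 0 for x <= 0, 1 for x >= 1, slope at most 2."""
    return _e(x) / (_e(x) + _e(1.0 - x))

def chi(r):
    """Cutoff with chi = 1 on [0, 1], chi = 0 on [3, inf) and |chi'| <= 1."""
    return 1.0 - smooth_step((r - 1.0) / 2.0)

def cutoff(n1, nT, R, R_prime):
    """chi_{R,R'} evaluated from the two Hoelder seminorms N_1(w) and N_T(w)."""
    return chi(n1 / R) * chi(nT / R_prime)

# Largest difference quotient of chi on a grid over [0, 4].
dr = 1e-3
max_slope = max(abs(chi((i + 1) * dr) - chi(i * dr)) / dr for i in range(4000))
```

With this choice, `cutoff(n1, nT, R, R_prime)` equals 1 when both seminorms are below their thresholds and vanishes as soon as `n1 >= 3 * R` or `nT >= 3 * R_prime`.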



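Denoting by $\delta Q\varphi_T^{x,y}$ (as shorthand) the left-hand side of (5.7), we get
the bound

$|\delta Q\varphi_T^{x,y}| \le |\mathbf E_w((1 - \chi_{R,R'})(\varphi_T \circ \Phi_T)(x, \cdot))| + |\mathbf E_w((1 - \chi_{R,R'})(\varphi_T \circ \Phi_T)(y, \cdot))|$

$\qquad + \Bigl|\mathbf E_w\, \chi_{R,R'}(w) \int_0^1 \langle D\varphi_T(\Phi_T(z_s,w)), D_x\Phi_T(z_s,w)\xi \rangle\, ds\Bigr| =: T_1 + T_2 + T_3$.

Since $\varphi$ is bounded by 1, the first two terms in this expression are both
bounded by

(5.8) $T_1 + T_2 \le 2\mathcal H(w, \{\tilde w \mid N_1(\tilde w) > R \text{ or } N_T(\tilde w) > R'\})$.

Concerning the last term, we can now use Lemma 5.1 to exchange the order
of integration and obtain

$T_3 = \Bigl|\int_0^1 \mathbf E_w\bigl(\chi_{R,R'}(w) \langle D\varphi_T(\Phi_T(z_s,w)), D_x\Phi_T(z_s,w)\xi \rangle\bigr)\, ds\Bigr|$.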

At this point, we use exactly the same trick as in the proof of the Bismut-Elworthy-Li
formula [5] to transform the derivative with respect to $x$ into
a Malliavin derivative with respect to the noise process $w$. Let $h : [0, 1] \to \mathbf R$
be any smooth function with $\operatorname{supp} h \subset (0, 1)$ and $\int_0^1 h(s)\, ds = 1$ and set

$v_x(t) = \int_0^t h(s)\, \sigma^{-1}(x_s) J_s \xi\, ds$,

where $x_s$ and $J_s$ are as in Lemma 5.1. It then follows from Lemma 5.1 that
one has $K_t^{v_x} = J_t\xi$ for every $t \ge 1$. Therefore,

$(D_x\Phi_T(x,w)\xi)(t) = (D_w\Phi_T(x,w)v_x)(t)$

for every $t \ge 1$.
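
The identity $K_t^{v_x} = J_t\xi$ for $t \ge 1$ can be observed numerically in a toy setting. The sketch below is an illustration under strong simplifying assumptions that are not those of the paper: the dimension is one, the rough driver is replaced by the smooth path $w(t) = \sin 3t$, the coefficients `f` and `sigma` are hypothetical, the bump `h` is a polynomial (only $C^1$ at the endpoints) standing in for a smooth compactly supported function, and the equations (5.2) and (5.4) are discretized with an explicit Euler scheme.

```python
import math

def f(x): return -x                           # hypothetical dissipative drift
def df(x): return -1.0
def sigma(x): return 2.0 + math.sin(x)        # hypothetical bounded, invertible diffusion
def dsigma(x): return math.cos(x)
def dw(t): return 3.0 * math.cos(3.0 * t)     # derivative of the smooth driver w(t) = sin(3t)

def h(t):
    """Polynomial bump on (0, 1) with integral 1 (stand-in for a smooth bump)."""
    return 30.0 * t * t * (1.0 - t) ** 2 if 0.0 < t < 1.0 else 0.0

def solve(x0, xi, T=2.0, dt=1e-4):
    """Euler scheme for the solution x, the Jacobian J and the noise derivative K."""
    n = int(round(T / dt))
    x, J, K = x0, 1.0, 0.0
    for i in range(n):
        t = i * dt
        a = df(x) + dsigma(x) * dw(t)         # coefficient of the linearized equation
        dv = h(t) * J * xi / sigma(x)         # dv/dt for the direction v_x of the text
        x_new = x + (f(x) + sigma(x) * dw(t)) * dt
        J_new = J + a * J * dt                # discretization of (5.2)
        K_new = K + (a * K + sigma(x) * dv) * dt  # discretization of (5.4)
        x, J, K = x_new, J_new, K_new
    return x, J, K

xi = 1.0
x_T, J_T, K_T = solve(0.7, xi)
```

Up to a discretization error of order `dt`, the value `K_T` agrees with `J_T * xi`, reflecting the fact that perturbing the noise in the direction $v_x$ has the same effect at times $t \ge 1$ as perturbing the initial condition by $\xi$.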

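Since, on the other hand, we assumed that $\varphi_T$ does not depend on the
solution of the SDE up to time 1, this implies, by (5.5), that

(5.9) $T_3 = \Bigl|\int_0^1 \mathbf E_w\bigl(\chi_{R,R'}(w) \langle \mathscr D(\varphi_T \circ \Phi_T)(z_s, \cdot), v_{z_s} \rangle\bigr)\, ds\Bigr|$.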



Note, now, that it follows from Lemma 5.1 that $\frac{d}{dt}v_z(t)$ is $\gamma$-Hölder continuous
with its norm bounded uniformly over $z$ in a ball of radius 1 around $x$
and over $N_1(w) \le 3R$. Since $\chi_{R,R'}$ vanishes for $N_1(w) \ge 3R$, this shows that
there exists a constant $C(R,x)$ depending on $R$ and $x$, but not on $R'$, such
that we can replace $v$ in (5.9) by $\tilde v$, defined by

$\tilde v(t) = \int_0^t \frac{dv}{ds}(s \wedge \tau)\, ds$,

where $\tau$ is the stopping time defined as the first time that $\bigl\|\frac{dv}{dt}\big|_{[0,\tau]}\bigr\|_\gamma$ is
greater than or equal to $C(R,x)$. This ensures that $\bigl\|\frac{d\tilde v}{dt}\bigr\|_\gamma \le C(R,x)$ almost
surely, while still being adapted.

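In order to apply Theorem 5.2, it thus suffices to note that the fact that
$\frac{d}{dt}\tilde v_{z_s}(t) = 0$ for $t \ge 1$ implies that there exists a constant $C$ such that

$\|\tilde v\|_{\mathcal K_H}^2 = \int_0^\infty \bigl(\mathcal D^{H+1/2}\tilde v(t)\bigr)^2\, dt \le C\Bigl\|\frac{d\tilde v}{dt}\Bigr\|_\gamma^2$.

Again using the fact that $\varphi$ is bounded by 1, we get, for $T_3$, the bound

$T_3 \le C \int_0^1 \Bigl(\mathbf E_w\Bigl\|\frac{d}{dt}\tilde v_{z_s}\Bigr\|_\gamma^2\Bigr)^{1/2} \bigl(1 + (\mathbf E_w\|\mathscr D\chi_{R,R'}\|_{\mathcal K_H}^2)^{1/2}\bigr)\, ds \le C(R,x)\|x - y\|$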



for all $y$ with $\|y - x\| \le 1$. Here, we used Lemma 5.5 to obtain the last
bound. Note that this bound does not depend on $R'$ and $T$, provided that
$R'$ is larger than the value $R'_T$ found in Lemma 5.5. We can therefore let $R'$
go to $\infty$ and get

$|\delta Q\varphi_T^{x,y}| \le 2\mathcal H(w, \{\tilde w \mid N_1(\tilde w) > R\}) + C(R,x)\|x - y\|$

for every $y$ with $\|y - x\| \le 1$. Since both terms can be chosen to be continuous
in $w$, $R$, $x$ and $y$ and since the first term tends to 0 as $R \to \infty$, the required
bound follows at once. □

Remark 5.6. There is a direct relation between the integrability of
$\|\sigma^{-1}(x_t)J_t\xi\|_\gamma$ and the continuity of $\mathcal R_1^*\mathcal Q\delta_{(x,w)}$ in the total variation topology.
In fact, if $\|\sigma^{-1}(x_t)J_t\xi\|_\gamma$ has second moments, then $\mathcal R_1^*\mathcal Q\delta_{(x,w)}$ is not
only continuous, but it is Lipschitz in $\mathbf R^d$. In fact, one then has the following
generalization of the Bismut-Elworthy-Li formula:

(5.10) $D_\xi Q\varphi(x, w) = \mathbf E_w\Bigl((\varphi \circ \Phi)(x, \cdot) \int_0^\infty \mathcal D^{H-1/2}\bigl(h(s)\sigma^{-1}(x_s)J_s\xi\bigr)\, dB(s)\Bigr)$,

where $B$ is the Brownian motion obtained from $w$ via $B = \mathcal D^{H-1/2}w$, $Q\varphi$ is as
in (5.1) and $h$ is a smooth function with support in $[0, 1]$ and which integrates
to 1. The main difficulty in getting good a priori bounds on the Jacobian
comes from the fact that the diffusion coefficient for the SDE satisfied by
the Jacobian is not globally bounded. One can check in [15] that, in general,
the Jacobian has finite moments in $\gamma$-Hölder spaces for some $\gamma \in (1/2, H)$ if
$H > 3/4$.

We conjecture that by using a Picard iteration, it should be possible 
to show integrability for the Jacobian in the supremum norm by realiz- 
ing the pathwise Riemann-Stieltjes integral as a symmetric integral in the 
sense of Russo and Vallois [19]. In this case, there is a representation of 
the Riemann-Stieltjes integral in terms of the Skorokhod integral plus a 
trace term involving the Gross-Sobolev derivative; see [1] for more details. 
This may allow sufficient improvement of the existing estimates to get (5.10) 
directly, without requiring any cutoff procedure. 

Remark 5.7. Between the completion of this article and its publication,
Hu and Nualart [10] announced bounds on the Jacobian for SDEs driven by
fractional Brownian motion with $H > 1/2$.

5.1. Proof of the main result. Similar to the Markovian case, the strong 
Feller property defined in Section 3 is not sufficient for uniqueness of invari- 
ant probability measures in the framework of SDS. In addition to the strong 
Feller property, we need an additional argument which provides the desired 
result of uniqueness. As discussed in Section 3, this will be achieved by 
means of an irreducibility argument jointly with the quasi-Markovian prop- 
erty for A. In general, irreducibility requires some kind of nondegeneracy of 
the diffusion term. As far as the quasi-Markovian property is concerned, it 
is a direct consequence of the properties of fractional Brownian motion. We 
first show that under (H1)-(H3), the SDS constructed above is topologically 
irreducible.

Proposition 5.8. Assume that the SDE (4.8) satisfies assumptions
(H1) and (H2). Then the SDS $\Lambda$ induced by the SDE is irreducible at time
$t = 1$.

Proof. Since everything is continuous, the proof of this is much easier
than that of the original support theorem [22] and we can use the same
argument as in [11], for example. The invertibility of $\sigma$ implies that the
control system

$\dot x_t = f(x_t) + \sigma(x_t)u_t, \qquad t \in [0, 1],\ x_0 \in \mathbf R^d$,

is exactly controllable for every $x_0$. This shows that the solution map $\Phi_1 : \mathbf R^d \times
\tilde{\mathcal W} \to \mathbf R^d$ is surjective for every fixed value of the first argument. Since this
map is also continuous, the claim follows from the fact that the topological
support of $\mathcal H(w, \cdot)$ is all of $\tilde{\mathcal W}$ (this is a consequence of the fact that it is a
Gaussian measure whose Cameron-Martin space contains $C_0^\infty$ and is thus a
dense subspace of $\tilde{\mathcal W}$). □
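
The exact controllability used in this proof is constructive: given a target $y$, one can prescribe the straight path $p_t = x_0 + t(y - x_0)$ and solve for the control, $u_t = \sigma^{-1}(p_t)(\dot p_t - f(p_t))$. The sketch below illustrates this in one dimension; the coefficients `f` and `sigma`, the helper `steer` and the numerical values are hypothetical and only serve as an example.

```python
import math

def f(x): return -x                          # hypothetical drift
def sigma(x): return 2.0 + math.sin(x)       # hypothetical invertible diffusion coefficient

def steer(x0, target, n=2000):
    """Drive x' = f(x) + sigma(x) u from x0 at time 0 to target at time 1.

    The open-loop control is computed along the planned straight path and the
    controlled equation is then integrated with an explicit Euler scheme.
    """
    dt = 1.0 / n
    velocity = target - x0                   # constant velocity of the planned path
    x = x0
    for i in range(n):
        planned = x0 + (i * dt) * velocity   # point on the straight path at time i * dt
        u = (velocity - f(planned)) / sigma(planned)
        x = x + (f(x) + sigma(x) * u) * dt
    return x

endpoint = steer(0.7, -3.0)
```

Since the planned path is linear in time, the Euler scheme follows it up to rounding, so `endpoint` is numerically equal to the target. In the setting of the proof, the control $u$ plays the role of the derivative of a smooth element of the noise space.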

In the sequel, we will use the following notation. If $\mu_1$ and $\mu_2$ are positive
measures with Radon-Nikodym derivatives $D_1$ and $D_2$, respectively, with
respect to some common reference measure $\mu$, we define the measure $\mu_1 \wedge \mu_2$
by

$(\mu_1 \wedge \mu_2)(dx) := \min\{D_1(x), D_2(x)\}\, \mu(dx)$.

Note that such a common measure $\mu$ can always be found (take $\mu = \mu_1 + \mu_2$,
for example) and that the definition of $\mu_1 \wedge \mu_2$ does not depend on the choice
of $\mu$.
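
For measures on a finite set this construction can be carried out by hand. The sketch below is a toy illustration in which measures are represented as dictionaries of point masses; the function name `meet` and the example measures are hypothetical. It computes $\mu_1 \wedge \mu_2$ with respect to two different reference measures and confirms that the result is the same.

```python
from fractions import Fraction

def meet(mu1, mu2, reference):
    """Compute mu1 ^ mu2 = min(D1, D2) d(reference) for finitely supported measures.

    The reference measure must charge every point charged by mu1 or mu2.
    """
    result = {}
    for x, mass in reference.items():
        d1 = Fraction(mu1.get(x, 0)) / mass   # Radon-Nikodym derivative of mu1
        d2 = Fraction(mu2.get(x, 0)) / mass   # Radon-Nikodym derivative of mu2
        result[x] = min(d1, d2) * mass
    return result

mu1 = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
mu2 = {"b": Fraction(1, 4), "c": Fraction(3, 4)}

# Reference measure mu1 + mu2, as suggested in the text.
ref_sum = {x: mu1.get(x, 0) + mu2.get(x, 0) for x in set(mu1) | set(mu2)}
# A second, unrelated reference measure charging the same points.
ref_other = {"a": Fraction(3), "b": Fraction(5), "c": Fraction(7)}

meet_sum = meet(mu1, mu2, ref_sum)
meet_other = meet(mu1, mu2, ref_other)
```

Here the only common mass sits at the point `"b"`, and the total mass $1/4$ of $\mu_1 \wedge \mu_2$ equals one minus the total variation distance between the two probability measures.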

Next, we prove that the SDS $\Lambda$ defined in (4.9) is quasi-Markovian over
the stationary noise process $(\mathcal W, \{\mathcal P_t\}_{t \ge 0}, \mathbf P_H, \{\theta_t\}_{t \ge 0})$. The main technical
estimate that we need for this is the following.

Lemma 5.9. Let $\mathcal A$ be the operator defined in Lemma 4.2. Then
$\mathcal D^{H+1/2}\mathcal A h \in L^2(\mathbf R_+)$ for every $h$ such that $h' \in C_0^\infty(\mathbf R_-)$.

Proof. Note that a simple change of variables yields, for $\mathcal A h$, the formula

(5.11) $(\mathcal A h)(t) = \beta_H \int_0^\infty \frac{g(y)}{y}\, \tilde h(t/y)\, dy$,

where we set $\tilde h(t) = h(-t)$ for convenience. This shows, in particular, that
$\mathcal A h$ is smooth. Therefore, we have, for $\mathcal D^{H+1/2}\mathcal A h$, the expression

(5.12) $\mathcal D^{H+1/2}\mathcal A h(t) = \beta_H \int_0^\infty \frac{g(y)}{y^{H+3/2}}\, (\mathcal D^{H-1/2}\tilde h')(t/y)\, dy$.

It follows immediately from the fact that $h' \in C_0^\infty$ that there exists $C$ such
that $\mathcal D^{H-1/2}\tilde h'$ is a smooth function bounded by $C\min\{1, t^{-H-1/2}\}$.
Therefore, one gets the bound

$|\mathcal D^{H+1/2}\mathcal A h(t)| \le C \int_0^\infty \frac{|g(y)|}{y^{H+3/2}} \min\{1, (y/t)^{H+1/2}\}\, dy$.

Using Lemma A.1, it is straightforward to get the bound

$|\mathcal D^{H+1/2}\mathcal A h(t)| \le C\min\{t^{-1}, t^{1/2-H}\}$,

where the constant $C$ depends on $h$. Since $t^{-1}$ is square integrable at $\infty$ and
$t^{1/2-H}$ is square integrable at 0, this concludes the proof. □

An immediate corollary of this is the following. 

Corollary 5.10. The set of pairs $(w, \tilde w)$ such that $w - \tilde w \in C_0^\infty(\mathbf R_-)$
is contained in $\mathcal N_{\mathcal W}^t$ for every $t > 0$.

This eventually leads to the following result. 

Proposition 5.11. The SDS in (4.9) induced by the SDE (4.8) is quasi-Markovian
over the stationary noise process $(\mathcal W, \{\mathcal P_t\}_{t \ge 0}, \mathbf P_H, \{\theta_t\}_{t \ge 0})$ defined
in Lemma 4.3.

Proof. Let us fix two nonempty open sets $U$, $V$ in $\mathcal W$ and two times
$s, t > 0$. As in the proof of Theorem 3.10, it is straightforward to construct
measurable maps $f_U$ and $f_V$ from $\mathcal W$ to $\mathcal W$ with the property that $f_U(w) \in
\operatorname{supp} \mathcal P_s(w, \cdot) \cap U$ for all $w$ such that $\mathcal P_s(w, U) > 0$, and similarly for $f_V$. Now,
define a map $w \mapsto \varepsilon(w)$ by

$\varepsilon(w) = \sup\{\varepsilon > 0 \mid B(f_U(w), \varepsilon) \subset U \text{ and } B(f_V(w), \varepsilon) \subset V\}$,

where we denote by $B(w, r)$ the ball of radius $r$ centered at $w$ in $\mathcal W$. Note
that $\varepsilon(w) > 0$ on $A := \{w \mid \mathcal P_s(w, U) \wedge \mathcal P_s(w, V) > 0\}$.

Note, also, that the support of $\mathcal P_s(w, \cdot)$ consists precisely of those functions
$\tilde w$ such that $\tilde w(t) - w(t + s)$ is constant for $t \le -s$. Let $h_0(w) = f_V(w) - f_U(w)$
so that $h_0(w)$ is a function in $\mathcal W$ which is constant for times prior to
$-s$. We now approximate $h_0$ by a smooth function $h$ with $h' \in C_0^\infty$. This
can, for example, be achieved by choosing two positive smooth functions
$\varphi$ and $\psi$ such that $\psi(t) = 0$ for $t \notin [-2, -1]$ and $\int \psi(t)\, dt = 1$. Furthermore,
$\varphi : \mathbf R_- \to [0, 1]$ is chosen to be decreasing and to satisfy $\varphi(-2) = 1$
and $\varphi(-1) = 0$. We then define

$\mathcal K_\varepsilon h(t) = \varphi(t/\varepsilon) \int_{t-2\varepsilon}^{t-\varepsilon} h(r)\, \varepsilon^{-1}\psi((r - t)/\varepsilon)\, dr$.

It is easy to check that $\mathcal K_\varepsilon h_0$ converges to $h_0$ strongly in $\mathcal W$ and that for
every $\varepsilon > 0$, the derivative of $\mathcal K_\varepsilon h_0$ has support in $[-s + \varepsilon, -\varepsilon]$. We now
define $h(w) = \mathcal K_{\delta(w)} h_0(w)$, where

$\delta(w) = \sup\{\delta > 0 \mid \|\mathcal K_\delta h_0(w) - h_0(w)\| \le \varepsilon(w)/2\}$.

We thus constructed three measurable maps $w \mapsto \varepsilon(w)$, $w \mapsto f_U(w)$, and
$w \mapsto h(w)$ such that, for every $w \in A$, the following properties hold:

(5.13a) $f_U(w) \in \operatorname{supp} \mathcal P_s(w, \cdot) \cap U, \qquad B(f_U(w), \varepsilon(w)/2) \subset U$,

(5.13b) $f_U(w) + h(w) \in \operatorname{supp} \mathcal P_s(w, \cdot) \cap V, \qquad B(f_U(w) + h(w), \varepsilon(w)/2) \subset V$.

Now, consider the maps $I^1, I^2 : \mathcal W \times \mathcal W \to \mathcal W \times \mathcal W$ defined by

$I^1(w, \tilde w) = (\tilde w, \tilde w + h(w))$,

$I^2(w, \tilde w) = (\tilde w - h(w), \tilde w)$.

With this notation, we define

(5.14) $\mathcal P_s^{U,V}(w, \cdot) = (I^1(w, \cdot))^*\mathcal P_s(w, \cdot)|_U \wedge (I^2(w, \cdot))^*\mathcal P_s(w, \cdot)|_V$.

It follows immediately from the definitions that $\mathcal P_s^{U,V}(w, \cdot)$ is a subcoupling
for the measures $\mathcal P_s(w, \cdot)|_U$ and $\mathcal P_s(w, \cdot)|_V$. Since $h(w)$ has its derivative in
$C_0^\infty$, it is straightforward to check that it belongs to the reproducing kernel
space of $\mathcal P_s(w, \cdot)$, so $(I^1(w, \cdot))^*\mathcal P_s(w, \cdot)$ and $(I^2(w, \cdot))^*\mathcal P_s(w, \cdot)$ are mutually
absolutely continuous. This, together with (5.13), implies that $\mathcal P_s^{U,V}(w, \cdot) > 0$
for every $w \in A$, as required.

The fact that $\mathcal P_s^{U,V}(w, \mathcal N_{\mathcal W}^t) > 0$ is then an immediate consequence of
Corollary 5.10. □

Combining all of these results, we can now prove the main "concrete" 
result of this article. 

Theorem 5.12. Under (H1)-(H3), there exists exactly one invariant 
probability measure for the SDS constructed in Section 4.2. 

Proof. The existence of such an invariant measure is ensured by Propo- 
sition 4.6. Its uniqueness follows from Theorem 3.10, combined with Propo- 
sitions 5.3, 5.8 and 5.11. □ 

Remark 5.13. Note that the solutions to (1.1) obtained from the SDS
constructed in Section 4.2 coincide precisely with the set of all adapted
solutions to (1.1). Therefore, Theorem 1.1 is an immediate consequence of
Theorem 5.12.

APPENDIX

This section studies some of the properties of the operator $\mathcal A$ defined in
(4.3). We first obtain the following bound.

Lemma A.1. Let $g : \mathbf R_+ \to \mathbf R$ be the function defined in Lemma 4.2. We
then have $g(x) = O(x)$ for $x \ll 1$ and $g(x) = O(x^{H-1/2})$ for $x \gg 1$.

Proof. Since $g$ is smooth at every $x > 0$, we only need to check the
result for $x \gg 1$ and $x \ll 1$. The behavior of $g(x)$ for $x \gg 1$ is straightforward
since

x H-l/2 + {H _ 3 /2) ^-3/2 < < s H-l/ 2 Vx > 

In order to treat the case $x \ll 1$, we rewrite $g$ as

$g(x) = C_1 x(1 + x)^{H-3/2} + C_2 x \int_0^1 (u + x)^{H-5/2}\bigl(1 - (1 - u)^{1/2-H}\bigr)\, du$

for two constants $C_1$ and $C_2$. Note that for $x \ll 1$, the first term is $O(x)$, so

$|g(x)| \le Cx + Cx \int_0^1 u^{H-5/2}\bigl|1 - (1 - u)^{1/2-H}\bigr|\, du$.

Now, note that $(1 - (1 - u)^{1/2-H}) = O(u)$ for $u \approx 0$ and that it diverges like
$(1 - u)^{1/2-H}$ for $u \approx 1$. Since, furthermore, $H > 1/2$, the function appearing
under the integral is integrable, so $g(x) = O(x)$. □

This allows us to show the following. 

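Proposition A.2. Let $\gamma$ and $\delta$ be such that $1/2 < \gamma < H$ and $H < \delta +
\gamma < 1$. Then the operator $\mathcal A$ is bounded from $\mathcal W_{(\gamma,\delta)}$ into $\tilde{\mathcal W}_{(\gamma,\delta)}$.

Proof. Fix $w \in \mathcal W_{(\gamma,\delta)}$ with $\|w\|_{(\gamma,\delta)} \le 1$ and consider two times $s$ and
$t$ with $s < t$. Using (5.11), we obtain

$|\mathcal A w(t) - \mathcal A w(s)| \le C|t - s|^\gamma \int_0^\infty \frac{|g(y)|}{y^{1+\gamma}} \Bigl(1 + \frac{t + s}{y}\Bigr)^\delta\, dy$,

so the claim follows if we can show that

$\int_0^\infty \frac{|g(y)|}{y^{1+\gamma}} \Bigl(1 + \frac{t + s}{y}\Bigr)^\delta\, dy \le C(1 + t + s)^\delta$.

The left-hand side of this expression is bounded by

$\int_0^\infty \frac{|g(y)|}{y^{1+\gamma}}\, dy + (t + s)^\delta \int_0^\infty \frac{|g(y)|}{y^{1+\gamma+\delta}}\, dy$.

Now, note that it follows immediately from Lemma A.1 that $g(y)y^{-a}$
is integrable for every $a \in (H + 1/2, 2)$. This condition is satisfied for both
$a = 1 + \gamma$ and $a = 1 + \gamma + \delta$, so the claim follows. □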